[BioC] Multiple genes per probeset

Richard Friedman friedman at cancercenter.columbia.edu
Wed Apr 21 23:41:45 CEST 2010


Dear Bioconductor list,

	Some time ago I downloaded a Mouse annotation database from the  
Affymetrix web site (Version 29 according to my notes).
It occasionally contains more than one gene per probeset separated by  
3 slashes. For example for the probeset
1415691_at the genes listed are Dlg1 /// LOC100047603. Presumably this  
is because the probeset is found to have
specificity for more than one gene.
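
For anyone comparing the two sources programmatically, here is a minimal sketch of splitting a field like the one above into individual gene symbols. It assumes only that the separator is the three slashes described above; the function name `split_gene_symbols` is hypothetical and is not part of any Affymetrix or Bioconductor tool.

```python
def split_gene_symbols(field, sep="///"):
    """Split a multi-gene annotation field into a list of gene symbols.

    Assumes symbols are separated by three slashes, as in the Affymetrix
    annotation file described above. Surrounding whitespace is stripped
    and empty pieces are dropped.
    """
    return [part.strip() for part in field.split(sep) if part.strip()]


# The probeset 1415691_at example from the annotation file
symbols = split_gene_symbols("Dlg1 /// LOC100047603")
print(symbols)
```

A field holding a single symbol comes back as a one-element list, so probesets with one gene and probesets with several can be handled the same way.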

	When I get the annotation from mouse4302.db (version 2.2.11) using
the annaffy package, I get only one gene
per probeset for those probesets that have more than one gene in the
Affymetrix annotation. For example, for the case of
1415691_at it lists only Dlg1.

	Is this because reannotation and mapping have found the gene for
which the probeset is most specific?
If that is the case can you point me at a reference as to how this is  
done?
The paper by Dai et al., "Evolving gene/transcript definitions
significantly alter the interpretation of GeneChip data," Nucleic Acids
Res. 2005 Nov 10;33(20):e175, deals with this issue, but as best I can
understand it, they assign different probeset names.

	Or should there be multiple genes per probeset on output?

Thanks and best wishes,
Rich
------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist,
Biomedical Informatics Shared Resource
Herbert Irving Comprehensive Cancer Center (HICCC)
Lecturer,
Department of Biomedical Informatics (DBMI)
Educational Coordinator,
Center for Computational Biology and Bioinformatics (C2B2)/
National Center for Multiscale Analysis of Genomic Networks (MAGNet)
Room 824
Irving Cancer Research Center
Columbia University
1130 St. Nicholas Ave
New York, NY 10032
(212)851-4765 (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

In Memoriam,
Philip Klass


