[BioC] Inconsistent annotation of affy probeset on Affymetrix chip for rat: 230.2

Marc Carlson mcarlson at fhcrc.org
Wed Jul 2 19:47:27 CEST 2008

Christoph Preuss wrote:
> Hi everyone,
> We analyzed a global expression microarray data set using gcrma for the
> normalization step and limma for finding differentially expressed
> genes. One of the most significant probesets (ProbeSetID annotation
> "1375535_at") in terms of differential expression is annotated as:
> Probeset "1375535_at"
> - Gene Symbol: Lpin1
> - Location: Chr 6
> in the Bioconductor package "rat2302" / "rat2302.db".
> We also looked at the Affymetrix web site, where the same probeset was
> annotated as "Transcribed sequence" on chromosome X.
> Affymetrix Annotation RG 230 2.0 Chip:
> -ProbeSetID:	1375535_at
> -Target Sequence:	
> >RAT230_2:1375535_AT
> gaagttagagagctgtttccccactttacattttaaaatatgtatgccaggatntaatca
> ttcctttaagtgtacacttcaaggagagatgtgccgaataagaaaatagctttctctagc
> gtgaagggttttgcgtccgccgagttcttaaggtcttttttaagagctactgtgtatgag
> tgtgtgtatgtgtgcgcatgcatgttcctgcgactagtcattcattcacatggtgatcag
> acaacaatgggagctggttcgtctaccttatcttgtgggtcctggagttcaatctcagat
> catcaggctgggcagcaagtgccttcaccctccgagccatcttgccatcccacagctgag
> cgtctaatatgacattgccgatga
> Interestingly, the given target sequence for the probeset matches only
> a mouse sequence and not even a rat mRNA (blastn search).
> The question is which annotation should we trust?
> Is there any chance to validate the probeset annotation?
> Many thanks in advance for any help.
> cheers,
> Christoph Preuss
> (Leibniz-Institute for Arteriosclerosis Research, University of
> Muenster Germany )
Hi Christoph,

I can only really speak for the Bioconductor annotations.  These are 
generated from public sources, starting from an initial mapping of each 
probe or probeset to a public accession (usually a GenBank accession, an 
Entrez Gene ID or a related type of ID).  In the case of "1375535_at", 
the probeset is an Affymetrix probeset, so we are ultimately at the 
mercy of Affymetrix to tell us accurately what this probeset is in that 
initial mapping.  After that we do the rest ourselves: we use the 
probeset-to-ID mapping to look up additional information from public 
sources (primarily NCBI), and that gives us the rest of the information 
in the package.  The file that you get from Affymetrix may contain a lot 
of the same data as our packages, but unless they describe it somewhere, 
I don't think we actually know for certain where they collected all of 
their information from.  The only information that we ever actually take 
from them is the initial mapping of their probeset onto a public 
accession.
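
As a rough illustration of the two-step process described above (this 
is a Python sketch, not the actual Bioconductor build code), the lookup 
can be thought of as two chained tables.  The accession string, the 
dictionary layout and the function name are all made up for the 
example; only the probeset ID, the symbol and the chromosome come from 
this thread.

```python
# Step 1: vendor-supplied mapping from probeset ID to a public accession.
# "ACC_PLACEHOLDER" is a hypothetical stand-in, not the real accession.
vendor_mapping = {"1375535_at": "ACC_PLACEHOLDER"}

# Step 2: records gathered from public sources, keyed by accession.
# The symbol and chromosome are the values quoted in this thread.
public_records = {"ACC_PLACEHOLDER": {"symbol": "Lpin1", "chromosome": "6"}}


def annotate(probeset_id, vendor_mapping, public_records):
    """Resolve a probeset through the vendor mapping, then public records."""
    accession = vendor_mapping.get(probeset_id)
    if accession is None:
        # The vendor file does not map this probeset at all.
        return None
    record = public_records.get(accession)
    if record is None:
        # The accession is known, but no public record was found for it.
        return {"accession": accession}
    return {"accession": accession, **record}


annotation = annotate("1375535_at", vendor_mapping, public_records)
```

The point of the sketch is that everything downstream depends on the 
first table: if the vendor assigns a different accession to the 
probeset, the symbol and location that come out of step 2 change with 
it.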

I dug up the latest Affymetrix mapping files that we used to generate 
this package and investigated.  In the file that I have (which was 
collected in late March), the probeset you listed is indicated to be 
Lpin1 and to be located on Chromosome 6, which agrees completely 
with the information that we gathered from NCBI and GoldenPath at 
that time.  As of this morning, NCBI still lists this gene as 
Lpin1, located on Chromosome 6.  However, there is also a 
field right next to that in the Affymetrix file, called 
"Alignments", which lists the X chromosome.  And when I pull up an even 
more recent file from Affymetrix, I see that they no longer list 
the location of this gene (that value has been replaced with "---"), 
and they no longer list the gene's name or symbol either.  They do still 
list Chromosome "X" in the alignment field, and they have even assigned 
different accessions to this probeset. 

So the short answer is that Affymetrix has changed their mind about what 
they are claiming this probeset is measuring.
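
If you want to check for this kind of drift yourself, one option is to 
compare the row for a probeset across two releases of the vendor file.  
Below is a minimal Python sketch of that comparison.  The field names 
and the dictionary layout are assumptions for illustration, not the 
real Affymetrix column headers, and I am assuming that a dropped symbol 
shows up as "---" in the same way the location does.  The changed 
accessions are left out because their values are not given here.

```python
MISSING = "---"  # placeholder the newer file uses for a blank value


def changed_fields(old, new):
    """Return {field: (old_value, new_value)} for fields that differ."""
    changes = {}
    for field in sorted(set(old) | set(new)):
        before = old.get(field, MISSING)
        after = new.get(field, MISSING)
        if before != after:
            changes[field] = (before, after)
    return changes


# Values as described in this thread; field names are hypothetical.
march_release = {"symbol": "Lpin1", "chromosome": "6", "alignments": "X"}
later_release = {"symbol": MISSING, "chromosome": MISSING, "alignments": "X"}

diff = changed_fields(march_release, later_release)
for field, (before, after) in diff.items():
    print(f"{field}: {before} -> {after}")
```

Run over a whole chip, a comparison like this would flag the probesets 
whose vendor annotation changed between the two releases.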

I hope this helps you,

