[BioC] Illumina annotation packages discrepancy
renaud at mancala.cbio.uct.ac.za
Mon Dec 1 10:39:11 CET 2008
Thanks for your answer. I've been (and still am) struggling a bit to get
consistent and up-to-date annotation for my data.
So, I guess it is more reliable to use the lumiHumanAll.db package?
However, what about the probes that are not annotated in lumiHumanAll.db
but look interesting for my study (i.e. appearing in my top lists
for differential expression or classification power)?
I've got such probes that are annotated neither in lumiHumanAll.db
nor in lumiHumanV2, but are annotated in illuminaHumanv2.
Hence no package gives me consistent annotation for my top genes. However
I've got an annotation file (that came with the array data, I guess
output by BeadStudio) that gives me annotations for all of my probes.
But as you mentioned, these might be outdated, which actually bothers
me. Any suggestion about that?
By the way, how come even Illumina's "proprietary" package
(illuminaHumanv2.db) doesn't annotate its own probes correctly? :(
Thanks again for your help and clarification, and the lumi package.
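
To make the fallback idea above concrete, here is a minimal sketch in base R of how the two sources could be combined: take the package annotation where it exists, and fall back to the BeadStudio file otherwise, while recording which source was used. The probe IDs, gene names and variable names below are all made up for illustration; in practice the first vector would presumably come from something like unlist(lookUp(nuIDs, 'lumiHumanAll.db', 'GENENAME')) and the second from the annotation file.

```r
# Toy stand-ins for the two annotation sources (all values are hypothetical).
probes     <- c("p1", "p2", "p3", "p4")
pkg_annot  <- c(p1 = "gene A", p2 = NA, p3 = "gene C", p4 = NA)
file_annot <- c(p1 = "gene A (old)", p2 = "gene B", p3 = "gene C", p4 = NA)

# Prefer the package annotation; use the file only where the package gives NA.
combined <- ifelse(is.na(pkg_annot[probes]),
                   file_annot[probes],
                   pkg_annot[probes])
names(combined) <- probes

# Keep track of where each annotation came from, so that values taken
# from the (possibly outdated) file can be treated with more caution.
source_used <- ifelse(!is.na(pkg_annot[probes]), "package",
               ifelse(!is.na(file_annot[probes]), "file", NA_character_))
names(source_used) <- probes

# Number of probes still without any annotation.
n_missing <- sum(is.na(combined))
```

This does not solve the problem of outdated mappings; it only makes explicit which top probes rely on the file rather than on a package.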
Pan Du wrote:
> Hi Renaud,
> The discrepancy is due to the different mapping criteria. Both the
> "lumiHumanAll.db" and "illuminaHumanv2.db" libraries are based on BLAST
> results against the RefSeq database. The "lumiHumanAll.db" library is nuID indexed and
> includes all the probes of different versions. For the mapping from probe to
> RefSeq, it defines both sensitivity and specificity criteria (see the vignette
> "IlluminaAnnotation.Rnw" in the lumi package). As a result, it might include
> fewer mappings than "illuminaHumanv2.db" because "lumiHumanAll.db" filters
> out some dubious mappings (e.g., one probe with multiple perfect matches).
> The "lumiHumanV2" library was built from the original annotation provided by
> Illumina. As a result, it has many more probe mappings. However,
> many mappings might be outdated because of the updates of the genome
> Hope this will clarify the confusion.
> On 11/28/08 5:00 AM, "bioconductor-request at stat.math.ethz.ch"
> <bioconductor-request at stat.math.ethz.ch> wrote:
>> Date: Thu, 27 Nov 2008 16:03:36 +0200
>> From: Renaud Gaujoux <renaud at mancala.cbio.uct.ac.za>
>> Subject: [BioC] Illumina annotation packages discrepancy
>> To: bioconductor at stat.math.ethz.ch
>> Message-ID: <492EA8B8.5000400 at cbio.uct.ac.za>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> Hi list,
>> I've got BeadSummary data from Illumina (Array content:
>> I imported it in R using the function lumi.batch.
>> This automatically computed the nuID for each probe and set the
>> annotation package to lumiHumanAll.db.
>> This is all good.
>> BUT, when I do
>> lookUp(nuIDs, 'lumiHumanAll.db', 'GENENAME')
>> I get 2921 out of 20589 probes with NA.
>> If I do the same using the old annotation package lumiHumanV2:
>> lookUp(nuIDs, 'lumiHumanV2', 'GENENAME')
>> I get 454 out of 20589 probes with NA.
>> Finally, if I do the same using the annotation package
>> illuminaHumanv2.db (but based on the corresponding TargetIDs):
>> lookUp(targetIDs, 'illuminaHumanv2.db', 'GENENAME')
>> I get 2041 out of 20589 probes with NA.
>> Can anybody give me an explanation for that discrepancy? And which
>> annotation package should I use, given that some interesting probes
>> (for my experiment) don't have annotation in the new version?
>> Also I could not find any reference to that HUMANREF-8_V2_11223162_B
>> annotation (neither on Illumina website nor in Bioconductor packages). I
>> only found information about HUMANREF-8_V2_11223162_A. Is the letter
>> suffix (A or B) really important?
> Pan Du, PhD
> Research Assistant Professor
> Northwestern University Biomedical Informatics Center
> 750 N. Lake Shore Drive, 11-176
> Chicago, IL 60611
> Office (312) 503-2360; Fax: (312) 503-5388
> dupan (at) northwestern.edu