[BioC] Illumina annotation packages discrepancy

Lynn Amon lamon at fhcrc.org
Wed Dec 3 18:49:35 CET 2008


You can match the Array_Address_Id in the bead manifest file to the
probe ID in the annotation package.
Lynn
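
A minimal sketch of that matching in base R, offered as an illustration only. It assumes the manifest has been read into a data frame with an Array_Address_Id column, and that the summarized data carries an identifier that also appears in the manifest. The Probe_Id column name, the my_probes vector and all values are hypothetical.

```r
# Toy stand-in for the bead manifest; in practice it would be read from the
# manifest file with read.delim() or read.csv(). All values are invented.
manifest <- data.frame(
  Probe_Id         = c("ILMN_0000001", "ILMN_0000002", "ILMN_0000003"),
  Array_Address_Id = c("0001234567", "0007654321", "0001112223"),
  stringsAsFactors = FALSE
)

# Identifiers available in the summarized data (hypothetical).
my_probes <- c("ILMN_0000003", "ILMN_0000001", "ILMN_0000009")

# match() gives the manifest row for each probe, or NA when it is absent.
idx <- match(my_probes, manifest$Probe_Id)
address_ids <- manifest$Array_Address_Id[idx]
names(address_ids) <- my_probes

# Probes missing from the manifest stay NA; the others can be tried as keys
# in the ProbeID annotation package.
print(address_ids)
```

Keeping the addresses as character strings avoids silently changing their formatting when the manifest is read in.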

Renaud Gaujoux wrote:
> Ok.
> Since I only have the summarized data I guess I cannot get back the
> ids from the scanner (?)
> For the moment I combine the data from the new lumiHumanAll and the
> old lumiHumanV2 when the new package does not give annotation.
>
> Thanks,
> Renaud.
>
> Lynn Amon wrote:
>> The probe IDs are the identifiers in the .txt or .csv files written by
>> the scanner, not those output by BeadStudio.
>> Lynn
>>
>> Renaud Gaujoux wrote:
>>  
>>> I just had a quick try but just got NAs. Should the code below work
>>> with this package?
>>>
>>> entrez <- getEG(probeids, 'illuminaHumanv2ProbeID.db')
>>>
>>> which wraps:
>>>
>>> unlist(lookUp(probeids, 'illuminaHumanv2ProbeID.db', "ENTREZID"))
>>>
>>> I tried with probeids being Illumina full IDs, Illumina trimmed IDs
>>> (without ILMN_), and with nuIDs.
>>>
>>> Thanks,
>>> Renaud
>>>
>>> Lynn Amon wrote:
>>>    
>>>> You'll want to use the illuminaHumanv2ProbeID.db package.
>>>> Lynn
>>>>
>>>> Renaud Gaujoux wrote:
>>>>      
>>>>> Oops... I'm really sorry, Mark, for the confusion. I think I misread the
>>>>> vignette.
>>>>>
>>>>> I BLASTed some of the missing probes and some of them gave quite
>>>>> convincing results (100% identity but with different variants),
>>>>> others didn't return any sequence. So I'll try with the package from
>>>>> 2.2.
>>>>>
>>>>> Thanks again,
>>>>> Renaud
>>>>>
>>>>> Lynn Amon wrote:
>>>>>        
>>>>>> The illuminaHumanv2.db package is not a "proprietary" package.  It
>>>>>> is currently maintained by Mark Dunning
>>>>>> (Mark.Dunning at cancer.org.uk).  It is based on BLASTed sequences, but
>>>>>> there was a problem in creating the package: when more than one
>>>>>> accession was assigned to a probe, the annotation program skipped
>>>>>> all those probes, which is why you are finding so many without
>>>>>> annotation.  You should contact Mark to find out whether that problem
>>>>>> was corrected and a new version released.  You could also try the
>>>>>> 2.2 release, which I created and which has annotation for all those
>>>>>> probes.
>>>>>> Lynn
>>>>>>
>>>>>>
>>>>>> Renaud Gaujoux wrote:
>>>>>>          
>>>>>>> Hi Pan,
>>>>>>>
>>>>>>> thanks for your answer. I've been (and still am) struggling a bit
>>>>>>> to get consistent and up to date annotation for my data.
>>>>>>>
>>>>>>> So, I guess it is more reliable to use the lumiHumanAll.db package?
>>>>>>>
>>>>>>> However, what about the probes that are not annotated in
>>>>>>> lumiHumanAll but look interesting for my study (i.e.
>>>>>>> appearing in my top lists for differential expression or
>>>>>>> classification power)?
>>>>>>> I've got such probes that are annotated neither in
>>>>>>> lumiHumanAll.db nor in lumiHumanV2 but are in illuminaHumanv2.
>>>>>>>
>>>>>>> Hence no package gives me consistent annotation for my top genes.
>>>>>>> However I've got an annotation file (that came with the array
>>>>>>> data, I guess output by BeadStudio) that gives me annotations for
>>>>>>> all of my probes. But as you mentioned, these might be outdated,
>>>>>>> which actually bothers me. Any suggestion about that?
>>>>>>>
>>>>>>> By the way, how come even Illumina "proprietary" packages
>>>>>>> (illuminaHumanv2.db) don't correctly annotate their own probes? :(
>>>>>>>
>>>>>>> Thanks again for your help and clarification, and the lumi package.
>>>>>>>
>>>>>>> Renaud
>>>>>>>
>>>>>>>
>>>>>>> Pan Du wrote:
>>>>>>>            
>>>>>>>> Hi Renaud,
>>>>>>>>
>>>>>>>> The discrepancy is due to the different mapping criteria. Both the
>>>>>>>> "lumiHumanAll.db" and "illuminaHumanv2.db" libraries are based on
>>>>>>>> BLAST results against the RefSeq database. The "lumiHumanAll.db"
>>>>>>>> library is nuID indexed and includes all the probes of the different
>>>>>>>> versions. For the mapping from probe to RefSeq, it defines both
>>>>>>>> sensitivity and specificity (see the vignette
>>>>>>>> "IlluminaAnnotation.Rnw" in the lumi package). As a result, it might
>>>>>>>> include fewer mappings than "illuminaHumanv2.db" because
>>>>>>>> "lumiHumanAll.db" filters out some dubious mappings (e.g., one probe
>>>>>>>> having multiple perfect mappings).
>>>>>>>>
>>>>>>>> The "lumiHumanV2" library was built from the original annotation by
>>>>>>>> Illumina. As a result, it has many more probe mappings. However,
>>>>>>>> many mappings might be outdated because of updates to the genome
>>>>>>>> annotation.
>>>>>>>>
>>>>>>>> Hope this will clarify the confusion.
>>>>>>>>
>>>>>>>>
>>>>>>>> Pan
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/28/08 5:00 AM, "bioconductor-request at stat.math.ethz.ch"
>>>>>>>> <bioconductor-request at stat.math.ethz.ch> wrote:
>>>>>>>>
>>>>>>>>  
>>>>>>>>              
>>>>>>>>> Date: Thu, 27 Nov 2008 16:03:36 +0200
>>>>>>>>> From: Renaud Gaujoux <renaud at mancala.cbio.uct.ac.za>
>>>>>>>>> Subject: [BioC] Illumina annotation packages discrepancy
>>>>>>>>> To: bioconductor at stat.math.ethz.ch
>>>>>>>>> Message-ID: <492EA8B8.5000400 at cbio.uct.ac.za>
>>>>>>>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>>>>>>>
>>>>>>>>> Hi list,
>>>>>>>>>
>>>>>>>>> I've got BeadSummary data from Illumina (Array content:
>>>>>>>>> HUMANREF-8_V2_11223162_B.XML.xml).
>>>>>>>>> I imported it in R using the function lumi.batch.
>>>>>>>>> This automatically computed the nuID for each probe and set the
>>>>>>>>> annotation package to lumiHumanAll.db.
>>>>>>>>> This is all good.
>>>>>>>>>
>>>>>>>>> BUT, when I do
>>>>>>>>>
>>>>>>>>> lookUp(nuIDs, 'lumiHumanAll.db', 'GENENAME')
>>>>>>>>>
>>>>>>>>> I get 2921 out of 20589 probes with NA.
>>>>>>>>>
>>>>>>>>> If I do the same using the old annotation package lumiHumanV2:
>>>>>>>>>
>>>>>>>>> lookUp(nuIDs, 'lumiHumanV2', 'GENENAME')
>>>>>>>>>
>>>>>>>>> I get 454 out of 20589 probes with NA.
>>>>>>>>>
>>>>>>>>> Finally, if I do the same using the annotation package
>>>>>>>>> illuminaHumanv2.db (but based on the corresponding TargetIDs):
>>>>>>>>>
>>>>>>>>> lookUp(targetIDs, 'illuminaHumanv2.db', 'GENENAME')
>>>>>>>>>
>>>>>>>>> I get 2041 out of 20589 probes with NA.
>>>>>>>>>
>>>>>>>>> Can anybody give me an explanation for that discrepancy? And which
>>>>>>>>> annotation package should I use, as it looks like some interesting
>>>>>>>>> probes (for my experiment) don't have annotation in the new version?
>>>>>>>>>
>>>>>>>>> Also I could not find any reference to that HUMANREF-8_V2_11223162_B
>>>>>>>>> annotation (neither on the Illumina website nor in Bioconductor
>>>>>>>>> packages). I only found information about HUMANREF-8_V2_11223162_A.
>>>>>>>>> Is the letter suffix (A or B) really important?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                     
>>>>>>>> ------------------------------------------------------
>>>>>>>> Pan Du, PhD
>>>>>>>> Research Assistant Professor
>>>>>>>> Northwestern University Biomedical Informatics Center
>>>>>>>> 750 N. Lake Shore Drive, 11-176
>>>>>>>> Chicago, IL  60611
>>>>>>>> Office (312) 503-2360; Fax: (312) 503-5388
>>>>>>>> dupan (at) northwestern.edu
>>>>>>>> ------------------------------------------------------
>>
>>   
>
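
The NA counts from the original post, and the fallback Renaud describes (use the newer annotation and fill gaps from the older one), could be sketched in base R roughly as below. This is only an illustration on toy vectors: new_annot and old_annot are hypothetical stand-ins for what unlist(lookUp(...)) might return for each package, and the probe and gene names are invented.

```r
# Toy per-probe annotation vectors, named by probe. In practice these would
# come from the two annotation packages; the values here are invented.
new_annot <- c(p1 = "gene A", p2 = NA, p3 = NA, p4 = "gene D")
old_annot <- c(p1 = "gene A (old)", p2 = "gene B", p3 = NA, p4 = "gene D (old)")

# Number of probes without annotation in each source.
n_missing <- c(new = sum(is.na(new_annot)), old = sum(is.na(old_annot)))

# Both vectors must describe the same probes in the same order.
stopifnot(identical(names(new_annot), names(old_annot)))

# Prefer the newer annotation; fall back to the older one only where the
# newer one is NA.
combined <- ifelse(is.na(new_annot), old_annot, new_annot)

# Record which source each value came from, so that possibly outdated
# mappings can be flagged later.
source_pkg <- ifelse(!is.na(new_annot), "new",
                     ifelse(!is.na(old_annot), "old", NA))

print(n_missing)
print(data.frame(combined, source_pkg))
```

Keeping the source next to each value matters here because, as Pan notes, the older mappings might be outdated.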


