[BioC] Annotation of U95av2 array

James W. MacDonald jmacdon at med.umich.edu
Wed Apr 18 22:10:28 CEST 2007


Hi George,

I would use Entrez Gene IDs to do the matching. You could also use the 
mappings that Affy provide.

http://www.affymetrix.com/support/technical/byproduct.affx?product=hg-u133-plus

I have never used them, but they may well be useful.
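
For illustration, matching on Entrez Gene IDs might look something like the
sketch below. It is written in Python purely for brevity and is only a rough
outline. The helper names (group_by_gene, match_arrays) and every probe set ID
and mapping in the toy data are made up, apart from 35566_f_at and Entrez Gene
ID 3576, which come from the thread below. In practice the two dictionaries
would be filled from the annotation packages or the Affy annotation files.

```python
from collections import defaultdict


def group_by_gene(probe_to_entrez):
    """Group probe set IDs by Entrez Gene ID, skipping unannotated probe sets."""
    by_gene = defaultdict(list)
    for probe, entrez in probe_to_entrez.items():
        if entrez is not None:
            by_gene[entrez].append(probe)
    return by_gene


def match_arrays(map_a, map_b):
    """Return {entrez_id: (probe sets on A, probe sets on B)} for shared genes."""
    genes_a = group_by_gene(map_a)
    genes_b = group_by_gene(map_b)
    shared = sorted(set(genes_a) & set(genes_b))
    return {g: (sorted(genes_a[g]), sorted(genes_b[g])) for g in shared}


# Toy data: only 35566_f_at -> 3576 is taken from the thread; the remaining
# probe set IDs and Entrez Gene IDs are hypothetical placeholders.
u95 = {
    "35566_f_at": "3576",
    "u95_probe_x": "3576",
    "u95_probe_y": "1017",
    "u95_probe_z": None,  # no Entrez Gene annotation
}
u133 = {
    "u133_probe_y": "3576",
    "u133_probe_z": "2064",
}

matched = match_arrays(u95, u133)
print(matched)
```

More than one probe set can map to the same gene on either array, so the
result keeps a list on each side rather than forcing a one-to-one match.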

Best,

Jim


Tseng, George C. wrote:
> Hi Jim,
> 
> Your clarification is very helpful. Then when we try to match genes
> from two types of arrays (say U95 and U133), what would you
> recommend? We originally thought Unigene ID would be a good choice
> but it would become difficult if one probe set maps to multiple IDs.
> Can you advise? Sorry, it may be a dumb question, but I'm from a
> statistics background.
> 
> Thanks.
> 
> George
> 
> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu]
> Sent: Wednesday, April 18, 2007 9:51 AM
> To: Tseng, George C.
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Annotation of U95av2 array
> 
> Hi George,
> 
> Please don't take list conversations off-list. The list archives are
> intended to be a source of information, and on the off chance that I
> might say something useful, it would be nice if people could find
> this later.
> 
> As to your question, as I said below, we just map things from Entrez
> Gene to the other annotation sources, so whatever Entrez Gene says,
> we report. So if I grep out some probeset ID that maps to multiple
> UniGene IDs, I might get something like 35566_f_at, which maps to 5
> UG IDs.
> 
> Now if I get the Entrez ID (3576), go to the Entrez Gene webpage for
> this ID, and scroll to the very bottom, I see five UniGene IDs that
> this Entrez Gene ID corresponds to. We report four of these five, the
> only difference being we report Hs.443948 instead of Hs.654584.
> 
> This is obviously a mistake because Hs.443948 is SLC4A1 instead of
> IL-8, but the hgu95av2 package was built on March 15, so maybe Entrez
> Gene has corrected this mistake in the interim.
> 
> See 
> http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=full_report&list_uids=3576
> 
> 
> Best,
> 
> Jim
> 
> 
> Tseng, George C. wrote:
> 
>> Jim,
>> 
>> Thanks so much for your response. I have one further question. In 
>> your annotation in Bioconductor, a probe set can map to multiple 
>> UniGene IDs. This really confuses me. Shouldn't it be only one ID?
>> 
>> George
>> 
>> -----Original Message-----
>> From: James MacDonald [mailto:jmacdon at med.umich.edu]
>> Sent: Sunday, April 01, 2007 9:59 AM
>> To: Tseng, George C.
>> Cc: biocannotation at lists.fhcrc.org; Lu, Shu-Ya
>> Subject: Re: Annotation of U95av2 array
>> 
>> Hi George,
>> 
>> Tseng, George C. wrote:
>> 
>> 
>>> Dear Dr. MacDonald and other Biocore Data Team members,
>>> 
>>> I'm using your array annotations from Bioconductor in my research
>>> and I teach it in my microarray course as well. It is indeed a
>>> great tool for our data analysis and methodological development.
>>> Recently we're working on a meta-analysis research project to
>>> incorporate information from multiple data sets. My student took
>>> the UniGene ID annotations for all the U95av2 probes and compared
>>> them with the results obtained from the Affymetrix website (the
>>> batch search in NetAffx). Among the 9704 probes annotated in
>>> Bioconductor, 724 probes were annotated completely differently in
>>> NetAffx.
>>> 
>>> My question is: Do you obtain your UniGene ID annotations from the
>>> Affymetrix database or another source? NetAffx annotations always
>>> have one UniGene ID per probeset, while your annotations can
>>> have many. Can you give us some detail about your annotation
>>> procedure?
>> 
>> 
>> Nianhua Li makes the annotation packages, so she would be the final
>> trusted source.
>> 
>> In the past, the process was to map Affy ID to Entrez Gene ID using
>> the annotation files that Affy supply on their website. We then
>> use AnnBuilder to do the mappings from Entrez Gene to all other
>> annotation sources, so it is not inconceivable that we would have
>> different UniGene IDs for a given probeset.
>> 
>> In my experience, the BioC annotations are more up to date and
>> accurate than what Affy supply either on NetAffx or in their
>> annotation files. This is based on running BLAT on the probe
>> sequences.
>> 
>> Best,
>> 
>> Jim
>> 
>> 
>> 
>> 
>>> Thanks!
>>> 
>>> George
>>> 
>>> ============================================ George C. Tseng 
>>> Assistant Professor Dept of Biostatistics and Human Genetics, 
>>> University of Pittsburgh http://www.pitt.edu/~ctseng, 
>>> 412-624-5318 ============================================
>> 
>> 
> 
> 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623

