[BioC] How are GO2PROBE built

john seers (IFR) john.seers at bbsrc.ac.uk
Thu Oct 2 14:18:29 CEST 2008


Hi Sean

Turning this into a more general question. Whenever I have to deal with
a new type of Affymetrix array I seem to have to root around
Bioconductor packages to find out how it is annotated etc. By the time I
come around to do it again it has all changed and is done in a different
way to how it was done before. My difficulty is it all feels a bit ad hoc
and comes at me in bits and pieces. Also I always feel there is probably
a better way to do it that I am missing.

Is there any documentation that gives the big picture and pulls it
together a bit? What are the foundational designs/philosophy that all
the packages follow? Is there a roadmap-type document that describes
Bioconductor's approach to all this?

Any pointers to useful information gratefully received.


John Seers


-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch
[mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Sean Davis
Sent: 02 October 2008 11:55
To: Oura Tomonori
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] How are GO2PROBE built

On Thu, Oct 2, 2008 at 3:11 AM, Oura Tomonori <tomonori.oura at gmail.com> wrote:
> Dear BioC,
> How are the mappings of Affymetrix probe IDs to Gene Ontology terms in
> the metadata packages provided by Bioconductor built?
> I am trying to use some gene set analysis packages and find that some
> packages use the *GO2PROBE (e.g. hgu133aGO2PROBE) information, while
> others use external gene set definitions, such as MSigDB.
> So I want to know the criteria for selecting specific GO terms among
> the possible terms for each probe ID in Bioconductor.
> I have already read the documentation for the AnnBuilder package, though.

To make a long story short, the annotations available from affy are
mapped to Entrez Gene IDs.  Then, the information from Entrez Gene--in
this case, gene ontology--is mapped to affy id.  The dates associated
with the data, the source of the data, and how the data are mapped will
all affect the final mapping of affy ID to gene ontology.  The nice
thing about gene ontology analyses is that they are typically based on
"sets" of genes making it much less important to start with EXACTLY the
same gene ontology mappings.  In fact, in practice, it will be pretty
difficult to do so.
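
As a rough illustration of the two-step mapping described above (probe ID
to Entrez Gene ID, then Entrez Gene ID to GO terms, inverted to give a
GO2PROBE-style table), here is a minimal Python sketch. All identifiers
and the function name build_go2probe are made up for illustration; this
is not the actual Bioconductor build code, which is described in the
SQLForge vignette.

```python
from collections import defaultdict

# Hypothetical toy data. Real annotation packages derive these tables
# from Affymetrix annotation files and Entrez Gene; the IDs below are
# placeholders, not real probe, gene, or GO identifiers.
probe_to_entrez = {
    "probe_1": "gene_A",
    "probe_2": "gene_A",   # several probes can map to the same gene
    "probe_3": "gene_B",
    "probe_4": None,       # a probe with no Entrez Gene assignment
}
entrez_to_go = {
    "gene_A": ["GO:term_x", "GO:term_y"],
    "gene_B": ["GO:term_y"],
}


def build_go2probe(probe_to_entrez, entrez_to_go):
    """Compose probe -> gene -> GO term, then invert to GO term -> probes."""
    go2probe = defaultdict(set)
    for probe, gene in probe_to_entrez.items():
        if gene is None:
            continue  # unmapped probes contribute nothing
        for term in entrez_to_go.get(gene, []):
            go2probe[term].add(probe)
    # Sort the probe lists so the output is deterministic.
    return {term: sorted(probes) for term, probes in go2probe.items()}


go2probe = build_go2probe(probe_to_entrez, entrez_to_go)
for term in sorted(go2probe):
    print(term, go2probe[term])
```

Changing either input table (a newer Entrez Gene snapshot, a different
probe-to-gene assignment) changes the resulting sets, which is why two
sources rarely agree exactly.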

If you want to see the details of the current Bioconductor annotation
package build process, you want to read the AnnotationDbi SQLForge
vignette, as AnnBuilder is outdated.

Finally, if I have misunderstood your question, perhaps you could

