[BioC] hgu133plus2 GO issues

Seth Falcon sfalcon at fhcrc.org
Tue Apr 18 18:37:02 CEST 2006

Hi Jake,

Jake <jjmichael at comcast.net> writes:
> Could someone please help me understand the differences between the
> (hgu133plus2)GO, GO2PROBE, GO2ALLPROBES?  I've found discrepancies that I
> can't quite explain:
> > mget("GO:0042611", hgu133plus2GO2PROBE)
> Error: value for 'GO:0042611' not found

GO annotates probe ids (really Entrez Gene ids) at the most specific
applicable term in the ontology.  In the above search of
hgu133plus2GO2PROBE, you are seeing that GO:0042611 does not have any
probes annotated directly to it.

> > mget("GO:0042611", hgu133plus2GO2ALLPROBES)
> $"GO:0042611"
>           <NA>            IEA            IEA            IEA
> <NA>
>    "209309_at"  "217014_s_at"    "210325_at"  "218831_s_at"

For a given GO term, the hgu133plus2GO2ALLPROBES environment is giving
you all Affy ids that map to this GO term _or_ a more specific term
that is related to this term (by related, I mean child-like relation,
where there is a path in the DAG connecting the terms).
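
To make the difference concrete, here is a minimal sketch in base R
using a made-up four-term DAG.  The term ids, probe names, and the
helper descendants() are all hypothetical; none of this is the
annotation package's own code.  Plain lists stand in for the
environments, so a missing key gives NULL rather than the error that
mget() raises.

```r
# Toy data only: term ids and probe names are invented for illustration.
# children[[term]] lists the direct child terms of each term in a tiny DAG.
children <- list(
  "GO:A" = c("GO:B", "GO:C"),
  "GO:B" = "GO:D",
  "GO:C" = "GO:D",
  "GO:D" = character(0)
)

# Direct annotations (the GO2PROBE analogue): probes are attached at the
# most specific term only, so "GO:A" and "GO:B" have no entry.
go2probe <- list(
  "GO:C" = "p1_at",
  "GO:D" = c("p2_at", "p3_at")
)

# All terms reachable from `term` by following child links (hypothetical helper).
descendants <- function(term) {
  kids <- children[[term]]
  unique(c(kids, unlist(lapply(kids, descendants))))
}

# GO2ALLPROBES analogue: probes annotated at the term or at any descendant.
go2allprobes <- lapply(
  setNames(names(children), names(children)),
  function(term) {
    terms <- c(term, descendants(term))
    unique(unlist(go2probe[terms], use.names = FALSE))
  }
)

go2probe[["GO:A"]]      # nothing is annotated directly at GO:A
go2allprobes[["GO:A"]]  # probes inherited from GO:C and GO:D
```

In this toy, GO:A plays the role of GO:0042611: empty in the direct
map, but populated once the more specific terms are folded in.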

The names on the vector are evidence codes.  See the man pages for

So for the above two cases, this is as expected and I don't think
there is any inconsistency.  

> and finally...
> ### "208729_x_at" is one of the probes returned with the above command
> > grep("GO:0042611", unlist(mget("208729_x_at", hgu133plus2GO)))
> numeric(0)

When you say "above command", which one are you referring to?
hgu133plus2GO should be the inverse map for hgu133plus2GO2PROBE.  

> "208729_x_at" is on the hgu133plus2 chip, but GO and GO2ALLPROBES don't
> map it to the same GO ID.

Can you be more specific?  Which env in the GO package are you talking
about?  Note that GO2ALLPROBES does not map to GO ids, it maps _from_
GO ids.
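
Because the map runs from GO ids to probes, finding the GO ids for one
probe means searching every probe vector.  A rough sketch on toy lists
follows; the ids and the helper terms_for_probe() are hypothetical.
With the real environment I would assume you first convert it with
as.list().

```r
# Toy data only: ids are invented, not taken from hgu133plus2.
# GO term -> probes, including probes inherited from more specific
# terms (the GO2ALLPROBES analogue).
go2allprobes <- list(
  "GO:A" = c("p1_at", "p2_at"),
  "GO:B" = "p2_at",
  "GO:C" = "p1_at"
)

# Probe -> directly annotated GO terms (the hgu133plus2GO analogue).
probe2go <- list(
  "p1_at" = "GO:C",
  "p2_at" = "GO:B"
)

# Reverse lookup (hypothetical helper): which GO ids list `probe` in
# their probe vector?
terms_for_probe <- function(probe, go2probes) {
  hit <- vapply(go2probes, function(p) probe %in% p, logical(1))
  names(go2probes)[hit]
}

via_all <- terms_for_probe("p1_at", go2allprobes)
direct  <- probe2go[["p1_at"]]

setdiff(via_all, direct)  # terms found only through the reverse lookup
all(direct %in% via_all)  # are the direct terms among the reverse-lookup hits?
```

The reverse lookup can return parent terms that the probe-to-GO map
never mentions, which is one way a list of GO ids can fail to map back
to a probe list.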

You can ask which GO ids have the 208729_x_at annotation using

If you then grep through hgu133plus2GO2ALLPROBES for GO ids that have
208729_x_at in their probe vector, then you should find more GO ids
because you are picking up parent terms that don't have the specific
annotation.  However, all the ids you found in hgu133plus2GO should

Clear as mud? :-)

> Is there something wrong here or am I just missing something?  If
> different, which is the most "reliable" mapping?  I'm concerned because
> I went through to validate GO IDs I had gotten from the GOHyperG
> function (a total of 314), and 117 of those I could not map back to my
> significant probe list using the hgu133plus2GO annotation.  I noticed by
> looking at the GOHyperG function that it uses information from
> Any help/enlightenment is much appreciated.
> PS - using R 2.2.1 with hgu133plus2 1.10.0

PS: sessionInfo() would be a better way to report that.  Then we would
also know your version of the GO package, for example.

+ seth
