[BioC] GEOquery and different types of GPL annotation files

Sean Davis sdavis2 at mail.nih.gov
Fri Jan 20 16:43:40 CET 2006

On 1/20/06 9:27 AM, "Peter" <bioconductor-mailinglist at maubp.freeserve.co.uk> wrote:

> Does anyone know what the difference is between these two GEO GPL files?
> GPL199.annot (540kb)
> GPL199.soft (2166kb)

The Annotation SOFT files (the .annot files) are built by GEO staff when
they build a GEO dataset (GDS).  They take whatever public identifiers they
can find in the submitted GPL and do their own lookups of what the features
on the array represent.  They are NOT available for every GPL, only for
those attached to a GDS, and they do not necessarily agree with the original
submitted GPL.  GEOquery does not currently handle them.  However, as you
noted in another post, Peter, the original GPLs as submitted by users are
often larger than those built by GEO, so I haven't found a strong reason to
work with the Annotation SOFT files.  In fact, I typically use the GPL
information only to look up some primary key (GenBank accession, Affy ID, or
something like that) and then build the annotation myself (or use a
Bioconductor annotation package).  The methods used to generate annotation
can be quite varied, and how long ago the annotation was last updated
matters (submitted GPLs are never updated).
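
To make the "primary key only" idea concrete, here is a rough sketch in
Python (not GEOquery, and not how GEO builds anything).  It assumes the
usual SOFT layout, where the feature table sits between the
!platform_table_begin and !platform_table_end lines as tab-separated text.
The sample data, the column names ID and GB_ACC, and both function names are
made up for illustration; real GPLs use whatever columns the submitter
chose, so check the header of the platform you are working with.

```python
import csv
import io

# Made-up miniature of a GPL SOFT file; real files are far larger and
# their column names vary from platform to platform.
SAMPLE_SOFT = (
    "^PLATFORM = GPL_EXAMPLE\n"
    "!Platform_title = Hypothetical array\n"
    "#ID = Probe identifier\n"
    "#GB_ACC = GenBank accession\n"
    "!platform_table_begin\n"
    "ID\tGB_ACC\tDESCRIPTION\n"
    "probe_1\tAA000001\tfirst example feature\n"
    "probe_2\t\tfeature without an accession\n"
    "probe_3\tAA000003\tthird example feature\n"
    "!platform_table_end\n"
)


def read_platform_table(soft_text):
    """Return the rows between the platform table markers as dicts."""
    table_lines = []
    in_table = False
    for line in soft_text.splitlines():
        if line.startswith("!platform_table_begin"):
            in_table = True
        elif line.startswith("!platform_table_end"):
            break
        elif in_table:
            table_lines.append(line)
    # The first collected line is the header row; fields are tab-separated.
    reader = csv.DictReader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    return list(reader)


def primary_key_map(rows, id_column="ID", key_column="GB_ACC"):
    """Map each feature ID to its primary key, skipping empty keys."""
    return {row[id_column]: row[key_column] for row in rows if row.get(key_column)}


rows = read_platform_table(SAMPLE_SOFT)
keys = primary_key_map(rows)
print(keys)
```

Everything else in the table (the DESCRIPTION column here) is deliberately
ignored; the resulting ID-to-accession map is what you would then feed into
a current annotation source of your choice.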

Hope that helps clarify things a bit.

