[BioC] AffyBatch: Replacing probe set IDs with GenBank Accession numbers?

Kasper Daniel Hansen k.hansen at biostat.ku.dk
Thu Mar 3 19:44:14 CET 2005

On Thu, Mar 03, 2005 at 11:57:16AM +0100, JSPC (Jeppe Skytte Spicker) wrote:
> How do I get from the probe set IDs in the AffyBatch@exprs to e.g.
> GenBank acc. numbers?
> I would like to be able to make the replacement using the .GIN file
> because I am also looking at custom affy chips.
> Thank you in advance.

Hi Jeppe

Everything is stored in so-called annotation packages. The basic
identifier is the probeset id, so everything maps from probeset id to
various databases.

If you are using hgu133a you load the metadata by
  > library(hgu133a)

Now you have a number of different environments available, which
correspond to maps between probeset ids and various databases. You can
get an overview by
  > library(help = hgu133a)

I am not an expert on these databases, but to me it seems like you
need hgu133aACCNUM. You can read about it by
  > ? hgu133aACCNUM

So how do you use it? Well, you take your favorite affy id, e.g. by
  > id <- geneNames(data)[i]
(the id of the i'th probeset)
And then you do
  > get(id, envir = hgu133aACCNUM)
This gives you the value, i.e. the accession number for that id.
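
One thing to be aware of: get() stops with an error if the id is not in
the environment. Below is a small sketch of a guarded lookup. It uses a
toy environment (toyenv) in place of hgu133aACCNUM so it runs without
the annotation package; the accession number and the helper name
lookup() are made up for illustration.

```r
# Toy stand-in for hgu133aACCNUM; the accession number is invented.
toyenv <- new.env(hash = TRUE)
assign("1000_at", "X00001", envir = toyenv)

# Hypothetical helper: return NA instead of failing on an unknown id.
lookup <- function(id, env) {
  # inherits = FALSE restricts the search to this environment only
  if (exists(id, envir = env, inherits = FALSE)) {
    get(id, envir = env, inherits = FALSE)
  } else {
    NA_character_
  }
}

lookup("1000_at", toyenv)     # the stored accession number
lookup("no_such_id", toyenv)  # NA rather than an error
```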

In case you need to look up several ids at the same time (which you
usually do), you do
  > idmany <- geneNames(data)[80:100] # character vector
  > mget(idmany, envir = hgu133aACCNUM)
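
mget() returns a list, so to actually replace the probeset ids you
probably want to flatten it first. Here is a rough sketch of how that
could look on an expression matrix. Again it uses a toy environment
(toyACCNUM) and a toy matrix (expr) with invented values, not the real
hgu133aACCNUM, and keeping the probeset id where no accession number is
found is just one possible choice.

```r
# Toy stand-in for hgu133aACCNUM; ids and accession numbers are invented.
toyACCNUM <- new.env(hash = TRUE)
assign("1000_at", "X00001", envir = toyACCNUM)
assign("1001_at", "X00002", envir = toyACCNUM)
assign("1002_f_at", "X00003", envir = toyACCNUM)

# Toy expression matrix with probeset ids as row names.
expr <- matrix(c(7.1, 8.3, 5.2, 6.9, 8.0, 5.5, 4.4, 4.1),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("1000_at", "1001_at", "1002_f_at", "unknown_at"),
                               c("chipA", "chipB")))

ids <- rownames(expr)
# ifnotfound = NA gives NA for ids missing from the environment
# instead of stopping with an error
acc <- unlist(mget(ids, envir = toyACCNUM, ifnotfound = NA))

# Keep the probeset id where no accession number is known.
rownames(expr) <- unname(ifelse(is.na(acc), ids, acc))
rownames(expr)
```

Note that several probesets can share one accession number, so the new
row names are not guaranteed to be unique.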

These environments basically function like hash tables in Perl, if you
are familiar with those.
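
To make the hash table comparison concrete, here is a minimal sketch in
base R. The environment h and its keys and values are made up; a real
annotation environment is used the same way, only with many more
entries.

```r
# An environment used as a hash table (keys and values are invented).
h <- new.env(hash = TRUE)
assign("probeA_at", "ACC1", envir = h)  # roughly $h{probeA_at} = "ACC1" in Perl
assign("probeB_at", "ACC2", envir = h)

keys <- ls(h)                          # like keys %h (ls() sorts them)
vals <- unlist(mget(keys, envir = h))  # named character vector of values

# Reverse lookup: which probeset id(s) carry a given value?
names(vals)[vals == "ACC2"]
```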

If you need to build custom annotation, you need to use the AnnBuilder
package. This is a big project, but certainly doable (I think :) Prepare
to spend a lot of time reading the documentation.


Kasper Daniel Hansen, Research Assistant
Department of Biostatistics, University of Copenhagen
