[BioC] GO Analysis for Mouse Exon Array

Fri Feb 24 03:21:12 CET 2012

Hi Daisy,

You should be using mogene10sttranscriptcluster.db, given that you
summarized to the transcript level ( rma(..., target='core') ).

My strategy for getting the ENTREZID of each probeset is shown in the script below.

HTH,

benilton

## preprocess with oligo
library(oligo)
mogeneFS <- read.celfiles(list.celfiles())
geneCore <- rma(mogeneFS, target='core')
psetsInGeneCore <- data.frame(probe_id=featureNames(geneCore))

## load the transcript-level annotation pkg (it matches target='core' in rma)
library(mogene10sttranscriptcluster.db)
annot <- mogene10sttranscriptclusterENTREZID
psetsInAnnot <- mappedkeys(annot)

## get ENTREZID, as a data.frame, for the probesets that have one
ENTREZID <- as.data.frame(annot[psetsInAnnot])

## map the ENTREZID back to the 'probesets' in geneCore obj;
## all.x=TRUE keeps probesets without an ENTREZID (they get NA)
psetENTREZID <- merge(psetsInGeneCore, ENTREZID, by='probe_id', all.x=TRUE)

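## ensure that psetENTREZID and geneCore are sorted in the same
## manner (merge() reorders rows; match() keeps the first row found
## for each probeset, so the result has one row per feature)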
idx <- match(featureNames(geneCore), psetENTREZID$probe_id)
psetENTREZID <- psetENTREZID[idx,]
rownames(psetENTREZID) <- NULL
rm(idx)

## show a random sample of 10 rows
n <- nrow(psetENTREZID)
psetENTREZID[sort(sample(n, 10)),]
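
The merge-then-match step can be checked on toy data, without any CEL files or annotation package. The snippet below is only a sketch: the probeset IDs ("p1" to "p4") and gene IDs are made up for illustration, and the object names are hypothetical stand-ins for the ones in the script above.

```r
## Toy stand-ins (made-up IDs, not real annotation)
feature_ids <- c("p3", "p1", "p2", "p4")   # order as in the expression object
annot <- data.frame(probe_id = c("p1", "p1", "p3"),
                    gene_id  = c("100", "200", "300"),
                    stringsAsFactors = FALSE)
psets <- data.frame(probe_id = feature_ids, stringsAsFactors = FALSE)

## Left join: every probeset is kept, unannotated ones get NA.
## "p1" maps to two gene IDs, so it contributes two rows.
merged <- merge(psets, annot, by = "probe_id", all.x = TRUE)

## merge() does not preserve the original order, so realign to
## feature_ids; match() returns the first matching row per probeset.
idx <- match(feature_ids, merged$probe_id)
aligned <- merged[idx, ]
rownames(aligned) <- NULL
aligned
```

Note that a probeset mapped to several ENTREZIDs ends up with only one of them after the match() step; if all mappings matter, work from the merged table instead.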