[BioC] Affymetrix mouse 430_2 array - annotation problem

Rao,Xiayu XRao at mdanderson.org
Tue Jul 22 18:15:44 CEST 2014


Hi, Jim

Thanks a lot for your previous help! I now have some annotation problems.

I used select() to annotate, as you suggested.
> fData(eset) <- select(mouse4302.db, featureNames(eset),c("SYMBOL","GENENAME","ENTREZID"))
Warning message:
In .generateExtraRows(tab, keys, jointype) :
  'select' resulted in 1:many mapping between keys and return rows

(1) Regarding the warning message, I read on the forum that you suggested either removing the duplicates or collapsing them into comma-separated vectors before incorporating them. So in my case, should I do
fData(eset) <- fData(eset)[!duplicated(fData(eset)$PROBEID),]
OR
eset2 <- tapply(fData(eset)$ENTREZID, fData(eset)[,1], paste, collapse = ",")
OR
can I just ignore the warning and do nothing, since I want to keep everything exactly as generated by select()?
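
For reference, the two options above can be compared on a small made-up table. This is only a sketch in base R: the data frame `ann` stands in for the output of select(), and its probe IDs and Entrez IDs are invented for illustration, not taken from mouse4302.db.

```r
# Toy stand-in for select() output; all values are invented for illustration.
ann <- data.frame(
  PROBEID  = c("p1_at", "p2_at", "p2_at", "p3_at"),
  ENTREZID = c("101", "202", "203", "304"),
  stringsAsFactors = FALSE
)

# Option 1: keep only the first row returned for each probe.
first_hit <- ann[!duplicated(ann$PROBEID), ]

# Option 2: collapse all Entrez IDs for a probe into one comma-separated string.
collapsed <- tapply(ann$ENTREZID, ann$PROBEID, paste, collapse = ",")

# Either way there is now exactly one entry per probe.
nrow(first_hit)
length(collapsed)
collapsed[["p2_at"]]
```

Note that tapply() returns its result ordered by the sorted probe IDs, not in the original order, so with option 2 the result would presumably need to be reindexed by featureNames(eset) before it is put back into the ExpressionSet.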


(2) It is strange that in the topTable output the row names and the first column (PROBEID) do not match. As you can see below, in the first row the row name is 1436717_x_at but the PROBEID is 1435289_at. Why?
> topTableF(fit2, adjust="BH")
                PROBEID SYMBOL                          GENENAME ENTREZID M129.15-M129.13
1436717_x_at 1435289_at Engase endo-beta-N-acetylglucosaminidase   217364       -1.946299
1436823_x_at 1435390_at   Eri2                 exoribonuclease 2    71151       -1.975441

             M129.17-M129.15   AveExpr         F      P.Value    adj.P.Val
1436717_x_at     -6.32963614 11.009177 3145.6769 8.379499e-17 3.499204e-12
1436823_x_at     -6.46817108 10.999412 2832.7874 1.551719e-16 3.499204e-12
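
One possible explanation, offered as a guess rather than a diagnosis: if select() returned more rows than there are probes, and that longer table ended up paired with the probes by position, then every row after the first duplicated key would be shifted. A minimal base R sketch with invented probe IDs (`keys` and `ann` are hypothetical stand-ins for featureNames(eset) and the select() output):

```r
# Hypothetical feature names, in array order.
keys <- c("a_at", "b_at", "c_at", "d_at")

# Stand-in for select() output with a 1:many hit on b_at (values invented).
ann <- data.frame(
  PROBEID = c("a_at", "b_at", "b_at", "c_at", "d_at"),
  SYMBOL  = c("GeneA", "GeneB1", "GeneB2", "GeneC", "GeneD"),
  stringsAsFactors = FALSE
)

# Pairing by position: the first length(keys) rows are read against keys.
by_position <- ann[seq_along(keys), ]
mismatched <- which(by_position$PROBEID != keys)
mismatched  # positions where the annotation row belongs to another probe

# Pairing by key: match() picks the first annotation row for each probe.
by_key <- ann[match(keys, ann$PROBEID), ]
rownames(by_key) <- keys
all(by_key$PROBEID == keys)
```

In this toy case the positional pairing goes wrong from the third probe onward, while the match() lookup keeps one correctly aligned row per probe. Whether this is what happened with the real eset is an assumption.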


Thanks,
Xiayu





-----Original Message-----
From: James W. MacDonald [mailto:jmacdon at uw.edu] 
Sent: Monday, July 21, 2014 11:43 AM
To: Rao,Xiayu; 'bioconductor at r-project.org'
Subject: Re: [BioC] Affymetrix mouse 430_2 array - gene expression and annotation

Hi Xiayu,

> 2) and add annotation thereafter? For the transcript-level annotation,
> I have used the following code before. But I am not sure whether, for
> this mouse array, there is a similar way or a similar transcript
> database to do this. I know there is a database called mouse4302.db.
> ID <- featureNames(geneCore2)
> Symbol <- getSYMBOL(ID,"hugene10sttranscriptcluster.db")
> fData(geneCore2) <- data.frame(ID=ID,Symbol=Symbol)

This is an old way of annotating things, and has been superseded (for like five years now) by a more compact API:

fData(geneCore2) <- select(mouse4302.db, featureNames(geneCore2), "SYMBOL")

And note you can add in other, more useful things like the Gene ID as well (while biologists tend to like HUGO symbols, they are not, as advertised, actually unique, so you always run the risk of thinking you have <a gene you care about> when in fact you are looking at the data for <some other gene with the same HUGO symbol>).

fData(geneCore2) <- select(mouse4302.db, featureNames(geneCore2),
c("SYMBOL","GENENAME","ENTREZID"))


Best,

Jim


