[BioC] Annotation.db: how automatically call a mapping?

Tue Jun 30 23:01:27 CEST 2009

Hi again,

Some remarks/questions:

First of all, thanks for all your help!

Meanwhile I got limma2annaffy to work. Nice HTML output! However some
minor things: 
- I experienced that the library 'annaffy' is not automatically loaded.
- how to get the scientific notation for the p-values (e.g. 1.24E-12;
for all 30 genes it is now "0").
- in the output file, the header 'Fold Change' actually reflects
'log2FC'.

Regarding the function probes2table, how to access the results [...html
= FALSE, filename = "output")]? 
Is it (=output) automatically saved? If so, where?; it is not in my
working directory.

Thanks,
Guido

> limma2annaffy(eset, fit2, design, cont.matrix, "moe430a", adjust =
"fdr", 
anncols = aaf.handler()[c(1:3, 6:7, 9:12)], number = 30, pfilt = NULL, 
fldfilt= NULL,tstat = TRUE, pval = TRUE, FC = TRUE, expression = TRUE, 
html = TRUE, text = FALSE, save = FALSE, addname = NULL, interactive =
TRUE)

You are going to output 3 tables, 
With this many genes in each:
 30 30 30
Do you want to accept or change these values? [ a/c ]a
Error in as.vector(x) : could not find function "aaf.handler"
> library(annaffy)
> limma2annaffy(eset, fit2, design, cont.matrix, "moe430a", adjust =
"fdr", 
anncols = aaf.handler()[c(1:3, 6:7, 9:12)], number = 30, pfilt = NULL, 
fldfilt= NULL,tstat = TRUE, pval = TRUE, FC = TRUE, expression = TRUE, 
html = TRUE, text = FALSE, save = FALSE, addname = NULL, interactive =
TRUE)

You are going to output 3 tables, 
With this many genes in each:
 30 30 30
Do you want to accept or change these values? [ a/c ]a

> -----Original Message-----
> From: James W. MacDonald [mailto:jmacdon at med.umich.edu] 
> Sent: 30 June 2009 22:29
> To: Hooiveld, Guido
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Annotation.db: how automatically call a mapping?
> 
> 
> 
> Hooiveld, Guido wrote:
> > Hi Jim,
> > Since your suggestion looks indeed easy and seems to provide 
> > everything I would like to have, a gave it a try (I'll have 
> a further 
> > look at Martin's comments/suggestions later).
> > However, it seems that 'probes2table' cannot properly 
> handle multiple 
> > contrast defined in limma:
> > 
> >> library(affycoretools)
> > Loading required package: GO.db
> > Loading required package: DBI
> > Loading required package: KEGG.db
> >> probes2table(eset, featureNames(eset), annotation(eset),
> > list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE, 
> > filename = "output") Loading required package: moe430a.db Error in 
> > aafTable(items = otherdata) :
> >   All columns must be of equal length
> >> length(fit2$p.value)
> > [1] 68070
> >> length(fit2$Amean)
> > [1] 22690
> > 
> > I analyzed three contrasts in limma, and 68070/3 is 22690, which 
> > exactly equals the number of probesets on the Affy MOE430A 
> array. This 
> > thus explains the error.
> > 
> > Question: can this easily be solved? Can limma2annaffy 
> handle multiple 
> > contrasts? (at the moment I am not familiar with 
> affycoretools at all).
> 
> limma2annaffy(eset, fit2, <designmatrixname>, 
> <contrastsmatrixname>, annotation(eset), pfilt=1, html = 
> FALSE, interactive = FALSE)
> 
> should give you text tables for each contrast. I would 
> normally use a reasonable p-value and possibly a fold filter 
> to reduce the output to the reporters of interest, and use 
> html = TRUE because my end users in general like that better 
> than the text. However, for all reporters on this chip the 
> HTML table is too huge to be of use.
> 
> > 
> > Thanks,
> > Guido
> > 
> >  
> > 
> >> -----Original Message-----
> >> From: bioconductor-bounces at stat.math.ethz.ch
> >> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf 
> Of James W. 
> >> MacDonald
> >> Sent: 30 June 2009 19:44
> >> To: Hooiveld, Guido
> >> Cc: bioconductor at stat.math.ethz.ch
> >> Subject: Re: [BioC] Annotation.db: how automatically call 
> a mapping?
> >>
> >> Hi Guido,
> >>
> >> Hooiveld, Guido wrote:
> >>> Hi Martin,
> >>>
> >>> Indeed, another useful, straigh-forward possibility for mapping. 
> >>> However, I am now facing the problem of properly combining the 
> >>> annotation info with the expression data. This is what I
> >> would like to
> >>> do:
> >>>
> >>>> Tab_data <- exprs(eset[probeids])
> >>>> Tab_data <- cbind(Tab_data, fit2$Amean) # to add average
> >> expression
> >>>> of
> >>> LIMMA output
> >>>> Tab_data <- cbind(Tab_data, fit2$p.value) # to add 
> p-value of LIMMA
> >>> output
> >>> etc.
> >>>
> >>> This al goes fine, however adding the annotation info
> >> 'mixes-up' the
> >>> content of Tab_data; the annotation data replaces the first
> >> column of
> >>> Tab_data, and the content of all cells is replaced by 'null'. I 
> >>> suspect it has something to do with the type of object I
> >> would like to
> >>> merge, but I am not sure.
> >>>
> >>>> map.entrez <- getAnnMap("ENTREZID", annotation(eset))
> >> map.entrez <-
> >>>> as.list(map.entrez[probeids])
> >> This sort of thing is going to get really difficult to do by hand 
> >> when you get to things that have a one-to-many 
> relationship. And you 
> >> are already duplicating existing efforts with what you 
> have done so 
> >> far.
> >>
> >> If you want to combine annotation data with results data, 
> you really 
> >> want to be using the annaffy package which does lots of 
> these things 
> >> seamlessly. And if you want things to be a bit easier, you could 
> >> consider using the affycoretools package as well, which 
> for the most 
> >> part uses annaffy to create output.
> >>
> >> You can do what you appear to want in one line:
> >>
> >> probes2table(eset, featureNames(eset), annotation(eset), 
> >> list("p-value"= fit2$p.value, "mean" = fit2$Amean), html = FALSE, 
> >> filename = "output")
> >>
> >> You might need to play around with the 'anncols' argument 
> to get what 
> >> annotation data you might want.
> >>
> >> If you want output specific to the contrasts you have fit, see 
> >> ?limma2annaffy.
> >>
> >> Best,
> >>
> >> Jim
> >>
> >>
> >>
> >>>
> >>>> Tab_data <- cbind(Tab_data, map.entrez)
> >>>   ^ in R this seems to work, but when saved as .txt the 
> content of 
> >>> Tab_data is completely mixed up. Before 'adding' map.entrez
> >> Tab_dat is
> >>> OK.
> >>>
> >>>
> >>>> write.table(cbind(rownames(Tab_data2), Tab_data2),
> >>> file="test_1234.txt", sep="\t", col.names=TRUE, row.names=FALSE)
> >>>
> >>>> class(Tab_data)
> >>> [1] "matrix"
> >>>> class(map.entrez)
> >>> [1] "list"
> >>>
> >>>
> >>> Do you, or someone elsr, have a suggestion how to properly
> >> link these
> >>> two types of data?
> >>> Thanks again,
> >>> Guido
> >>>
> >>>
> >>>
> >>>  
> >>>
> >>>> -----Original Message-----
> >>>> From: bioconductor-bounces at stat.math.ethz.ch
> >>>> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf
> >> Of Martin
> >>>> Morgan
> >>>> Sent: 30 June 2009 00:00
> >>>> To: Hooiveld, Guido
> >>>> Cc: bioconductor at stat.math.ethz.ch
> >>>> Subject: Re: [BioC] Annotation.db: how automatically call
> >> a mapping?
> >>>> Hooiveld, Guido wrote:
> >>>>> Hi,
> >>>>>  
> >>>>> I am facing a problem i cannot solve myselves, despite
> >> everything i
> >>>>> read/know. But i assume the solution is easy for the more
> >>>> knowledgable
> >>>>> folks in BioC/R...
> >>>>>  
> >>>>> This does work:
> >>>>>> library(moe430a.db)
> >>>>>> xxyy <- moe430aSYMBOL
> >>>>>> xxyy
> >>>>> SYMBOL map for chip moe430a (object of class "AnnDbBimap")
> >>>>>  
> >>>>> However, for this to work you need to know the array type
> >>>> of the data
> >>>>> that is analyzed.
> >>>>>  
> >>>>>  
> >>>>> Now i would like to automatically extract the (e.g.)
> >> SYMBOL mapping
> >>>>> from an annotation.db, thus by retrieving the array type
> >>>> from the eset.
> >>>>>  
> >>>>>> library(affy)
> >>>>>> eset <- rma(data)
> >>>>>> probeids <- featureNames(eset)
> >>>>>> annotation(eset)
> >>>>> [1] "moe430a"
> >>>>>  
> >>>>> But how can i use this info to properly call the SYMBOL mapping?
> >>>> Hi Guido --
> >>>>
> >>>> to get the appropriate map
> >>>>
> >>>>   library(annotate)
> >>>>   map = getAnnMap("SYMBOL", annotation(eset))
> >>>>
> >>>> to select just the relevant probes
> >>>>
> >>>>   map[probeids]
> >>>>
> >>>> toTable(map[probeids]) or as.list(map[probeids]) might 
> be the next 
> >>>> step in the work flow.
> >>>>
> >>>> Martin
> >>>>
> >>>>>  
> >>>>> I tried this:
> >>>>>> arraytype <- annotation(eset)
> >>>>>> arraytype <- paste(arraytype, "db", sep = ".") arraytype
> >>>>> [1] "moe430a.db"
> >>>>>> arraytype <- paste("package", arraytype, sep = ":") gh <-
> >>>>>> ls(arraytype) gh
> >>>>>  [1] "moe430a"              "moe430a_dbconn"       
> >> "moe430a_dbfile"
> >>>>> "moe430a_dbInfo"       "moe430a_dbschema"     "moe430aACCNUM"
> >>>>> "moe430aALIAS2PROBE"   "moe430aCHR"           
> "moe430aCHRLENGTHS"
> >>>>> "moe430aCHRLOC"       
> >>>>> [11] "moe430aCHRLOCEND"     "moe430aENSEMBL"
> >>>>> "moe430aENSEMBL2PROBE" "moe430aENTREZID"      "moe430aENZYME"
> >>>>> "moe430aENZYME2PROBE"  "moe430aGENENAME"      "moe430aGO"
> >>>>> "moe430aGO2ALLPROBES"  "moe430aGO2PROBE"     
> >>>>> [21] "moe430aMAP"           "moe430aMAPCOUNTS"     "moe430aMGI"
> >>>>> "moe430aMGI2PROBE"     "moe430aORGANISM"      "moe430aPATH"
> >>>>> "moe430aPATH2PROBE"    "moe430aPFAM"          "moe430aPMID"
> >>>>> "moe430aPMID2PROBE"   
> >>>>> [31] "moe430aPROSITE"       "moe430aREFSEQ"        
> "moe430aSYMBOL"
> >>>>> "moe430aUNIGENE"       "moe430aUNIPROT"
> >>>>>  
> >>>>>> gh[33]
> >>>>> [1] "moe430aSYMBOL"
> >>>>>> symbols <- mget(probeids, gh[33])
> >>>>> Error in mget(probeids, gh[33]) : second argument must be an 
> >>>>> environment
> >>>>>  
> >>>>> This also doesn't work:
> >>>>>> symbols <- mget(probeids, envir=gh[33])
> >>>>> Error in mget(probeids, envir = gh[33]) : 
> >>>>>   second argument must be an environment
> >>>>>  
> >>>>> My approach thus is the wrong approach to automatically extract 
> >>>>> mappings from a annotation.db.
> >>>>> Since i don't know about any other possibility, i would
> >>>> appreciate if
> >>>>> someone could point me to a working solution.
> >>>>>  
> >>>>> Thanks,
> >>>>> Guido
> >>>>>  
> >>>>>
> >>>>> ------------------------------------------------
> >>>>> Guido Hooiveld, PhD
> >>>>> Nutrition, Metabolism & Genomics Group Division of Human
> >> Nutrition
> >>>>> Wageningen University Biotechnion, Bomenweg 2
> >>>>> NL-6703 HD Wageningen
> >>>>> the Netherlands
> >>>>> tel: (+)31 317 485788
> >>>>> fax: (+)31 317 483342 
> >>>>> internet:   http://nutrigene.4t.com <http://nutrigene.4t.com/>  
> >>>>> email:      guido.hooiveld at wur.nl 
> >>>>>
> >>>>>
> >>>>>
> >>>>> 	[[alternative HTML version deleted]]
> >>>>>
> >>>>> _______________________________________________
> >>>>> Bioconductor mailing list
> >>>>> Bioconductor at stat.math.ethz.ch
> >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>>> Search the archives: 
> >>>>> 
> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>> _______________________________________________
> >>>> Bioconductor mailing list
> >>>> Bioconductor at stat.math.ethz.ch
> >>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>> Search the archives: 
> >>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>>>
> >>>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives: 
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> --
> >> James W. MacDonald, M.S.
> >> Biostatistician
> >> Douglas Lab
> >> University of Michigan
> >> Department of Human Genetics
> >> 5912 Buhl
> >> 1241 E. Catherine St.
> >> Ann Arbor MI 48109-5618
> >> 734-615-7826
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: 
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >>
> > 
> 
> --
> James W. MacDonald, M.S.
> Biostatistician
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
> 734-615-7826
> 
>