[BioC] Fwd: GO terms: Annotation for HumanMethylation450

Jinyan Huang jhuang at hsph.harvard.edu
Wed Apr 3 20:07:49 CEST 2013


Marc,

After updating R to 2.15.2, I still get the error.

R

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ids = c( "GO:0008150", "GO:0001869")
> result = select(GO.db, keys =ids, cols=c("DEFINITION","TERM"))
Error: could not find function "select"
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
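
A possible explanation, offered as a guess rather than a confirmed diagnosis: the sessionInfo() output above lists only base packages, so GO.db (and with it AnnotationDbi, which provides select()) does not appear to be attached in this session. Below is a minimal sketch of the same query with the package loaded first, assuming GO.db is installed. The column vector is passed by position because that argument is named cols in this thread's code and, as far as can be told, columns in later AnnotationDbi releases.

```r
## Sketch only. Assumption: GO.db is installed from Bioconductor.
ids <- c("GO:0008150", "GO:0001869")

if (requireNamespace("GO.db", quietly = TRUE)) {
  ## Attaching GO.db also attaches AnnotationDbi, which defines select().
  library(GO.db)
  ## Third argument passed by position (named 'cols' in the thread's code).
  result <- select(GO.db, ids, c("DEFINITION", "TERM"))
  print(result)
} else {
  message("GO.db is not installed; install it from Bioconductor first.")
}
```

If the package attaches cleanly, calling sessionInfo() again should list GO.db and AnnotationDbi under the attached packages.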

On Wed, Apr 3, 2013 at 1:40 PM, Marc Carlson <mcarlson at fhcrc.org> wrote:
> Hi Jinyan,
>
> The code I showed you before will get you all the GO TERMS and their
> DEFINITIONS into a single data frame (without using too much RAM):
>
> library(GO.db)
> ## k is now all the GOIDs that we actually have Terms for.
> k = keys(GOTERM)
> ## If you use another source of GOIDs, you might want to call unique()
> ## on that first, in order to save time.
> ## Then just call select like I showed you before
> result = select(GO.db, keys =k, cols=c("DEFINITION","TERM"))
>
> ## Then you can use merge() to attach that onto your gene IDs later on.
>
> I hope this helps,
>
>
>    Marc
>
>
>
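
The approach Marc describes above (look up each distinct GOID once, then merge() the term table back onto the per-probe IDs) can be sketched with toy data in base R. The probe IDs, the placeholder term strings, and the object names probe2go and term_table are made up for illustration; in practice term_table would come from select() on GO.db.

```r
## Toy stand-in for per-probe GO annotations (hypothetical probe IDs).
probe2go <- data.frame(
  probe = c("cg0001", "cg0001", "cg0002", "cg0003"),
  GOID  = c("GO:0008150", "GO:0001869", "GO:0008150", "GO:0008150"),
  stringsAsFactors = FALSE
)

## Look up each distinct GOID only once.
k <- unique(probe2go$GOID)

## Placeholder for the result of select(GO.db, keys = k, ...);
## the TERM strings here are dummies, not real GO terms.
term_table <- data.frame(
  GOID = k,
  TERM = paste("term for", k),
  stringsAsFactors = FALSE
)

## Attach the terms to every probe/GOID pair.
annotated <- merge(probe2go, term_table, by = "GOID")
print(annotated)
```

Because the lookup table has one row per distinct GOID, its size depends on the number of GO terms rather than on the number of probes.
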
> On 04/03/2013 08:28 AM, Tim Triche, Jr. wrote:
>> Probably so. I will look into it. Thanks for the report
>>
>> --t
>>
>> On Apr 3, 2013, at 8:21 AM, Jinyan Huang <jhuang at hsph.harvard.edu> wrote:
>>
>>> Is there any other, more efficient way to do this? I thought there
>>> was a problem in my code.
>>>
>>> On Wed, Apr 3, 2013 at 11:14 AM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>>>> Buy more RAM :-)
>>>>
>>>> --t
>>>>
>>>> On Apr 3, 2013, at 6:59 AM, Jinyan Huang <jhuang at hsph.harvard.edu> wrote:
>>>>
>>>>> When I try to get all GO terms for IlluminaHumanMethylation450k, I run
>>>>> into a memory problem: it uses more than 10 GB of memory.
>>>>>
>>>>> GOids <- lapply(res2, function(x) unlist(lapply(x, function(y) y$GOID)))
>>>>> GOterms <- lapply(GOids, function(x) mget(x, GOTERM, ifnotfound=NA))
>>>>> Error: memory exhausted (limit reached?)
>>>>> Execution halted
>>>>>
>>>>>
>>>>> --------------------------------------Get_all_GO.R----------------------------------------------
>>>>>
>>>>> library(IlluminaHumanMethylation450k.db)
>>>>> ## allow both singly- and multiply-mapped probes (e.g. for SYMBOL)
>>>>> IlluminaHumanMethylation450kGOall <- toggleProbes(IlluminaHumanMethylation450kGO, 'all')
>>>>> ## now let's look at the differences that result from toggleProbes()
>>>>> mapped_probes_toggled <- mappedkeys(IlluminaHumanMethylation450kGOall)
>>>>> res <- mget(mapped_probes_toggled, IlluminaHumanMethylation450kGOall,
>>>>> ifnotfound=NA)
>>>>> res2 <- lapply(res, function(x) x[sapply(x, function(y) y['Evidence']!='IEA')])
>>>>> ## fetch the GOIDs from the unencumbered toggled map, to get terms for them
>>>>> library(GO.db)
>>>>> GOids <- lapply(res2, function(x) unlist(lapply(x, function(y) y$GOID)))
>>>>> GOterms <- lapply(GOids, function(x) mget(x, GOTERM, ifnotfound=NA))
>>>>> d<-lapply(GOterms,function(x)do.call(rbind,lapply(x,function(y)data.frame(y@Term,y@GOID,y@Ontology))))
>>>>> df<-do.call(rbind,d)
>>>>> len <- sapply(d,function(x)length(x[,1]))
>>>>> probes <- rep(names(d),len)
>>>>> df.out<-data.frame(probes=probes,df)
>>>>> names(df.out)<-c("probe","GoTerm","GOID","GOCategory")
>>>>> write.table(df.out,"GO_all.txt",quote=F,row.names=F,col.names=T,sep="\t")
>>>>>
>>>>> ----------------------------------------------------------------------------------------------------------------
>>>>>
>>>>> On Tue, Apr 2, 2013 at 7:29 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Not sure how I managed not to cc: the list on this initially. Here's some GO.db code with a sort of "moral" to it ;-)
>>>>>>
>>>>>> --t
>>>>>>
>>>>>> Begin forwarded message:
>>>>>>
>>>>>> library(IlluminaHumanMethylation450k.db)
>>>>>>
>>>>>> ## allow both singly- and multiply-mapped probes (e.g. for SYMBOL)
>>>>>> IlluminaHumanMethylation450kGOall <- toggleProbes(IlluminaHumanMethylation450kGO, 'all')
>>>>>>
>>>>>> ## now let's look at the differences that result from toggleProbes()
>>>>>> mapped_probes_default <- mappedkeys(IlluminaHumanMethylation450kGO)
>>>>>> mapped_probes_toggled <- mappedkeys(IlluminaHumanMethylation450kGOall)
>>>>>> multimapped <- setdiff( mapped_probes_toggled, mapped_probes_default )
>>>>>>
>>>>>> res0 <- mget(head(multimapped), IlluminaHumanMethylation450kGO, ifnotfound=NA)
>>>>>> res <- mget(head(multimapped), IlluminaHumanMethylation450kGOall, ifnotfound=NA)
>>>>>>
>>>>>> ## fetch the GOIDs from the unencumbered toggled map, to get terms for them
>>>>>>
>>>>>> library(GO.db)
>>>>>> GOids <- lapply(res, function(x) unlist(lapply(x, function(y) y$GOID)))
>>>>>> GOterms <- lapply(GOids, function(x) mget(x, GOTERM, ifnotfound=NA))
>>>>>> head(GOterms)
>>>>>>
>>>>>>
>>>>>>> I'll add this to the docs (next release)
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> --t
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 29, 2013 at 11:24 AM, Fabrice Tourre <fabrice.ciup at gmail.com> wrote:
>>>>>>>> Tim,
>>>>>>>>
>>>>>>>> Thank you very much for your reply.
>>>>>>>> I have a list of probe lists.
>>>>>>>> Do you have an example script for getting the GO terms instead of the GO IDs?
>>>>>>>>
>>>>>>>> The documentation is not very clear on this.
>>>>>>>> http://www.bioconductor.org/packages/2.11/data/annotation/html/IlluminaHumanMethylation450k.db.html
>>>>>>>>
>>>>>>>> On Fri, Mar 29, 2013 at 12:29 PM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:
>>>>>>>>> Oddly enough, the paper from UCSD with Illumina's folks on it (*) used the
>>>>>>>>> IlluminaHumanMethylation450k.db package (which I am currently rebuilding to
>>>>>>>>> have a startup message about toggleProbes()) to annotate both CpG islands
>>>>>>>>> and GO terms.
>>>>>>>>>
>>>>>>>>> (*)
>>>>>>>>> http://idekerlab.ucsd.edu/publications/Documents/Hannum_MolCell_2012.pdf
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Mar 29, 2013 at 8:49 AM, Fabrice Tourre <fabrice.ciup at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>> Dear list,
>>>>>>>>>>
>>>>>>>>>> In the annotation file of Infinium HumanMethylation450 BeadChip,
>>>>>>>>>>
>>>>>>>>>> http://support.illumina.com/documents/MyIllumina/b78d361a-def5-4adb-ab38-e8990625f053/HumanMethylation450_15017482_v.1.2.csv
>>>>>>>>>>
>>>>>>>>>> there is no annotation of GO terms or pathways for each probe set, as
>>>>>>>>>> there is in the annotation file HG-U133_Plus_2.na32.annot.csv.
>>>>>>>>>>
>>>>>>>>>> Is there a Bioconductor package that annotates the Infinium
>>>>>>>>>> HumanMethylation450 probes, i.e. given a probe, returns its GO terms and
>>>>>>>>>> pathways?
>>>>>>>>>>
>>>>>>>>>> Thank you very much in advance.
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Bioconductor mailing list
>>>>>>>>>> Bioconductor at r-project.org
>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>>>>> Search the archives:
>>>>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> A model is a lie that helps you see the truth.
>>>>>>>>>
>>>>>>>>> Howard Skipper
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best wishes,
>>>>>
>>>>> Jinyan HUANG
>



-- 
Best wishes,

Jinyan HUANG


