[BioC] GO.db redundance

Fri May 15 19:11:10 CEST 2009

Hi Giacomo,

Let's try to keep this on list so that others can benefit from your
questions.

The reason you cannot coerce this into a data.frame is that
what you have in the GO_CC_description object is actually a list of
"GOTerms" objects.  The error message from R is telling you that it
does not know how to coerce that class into a data.frame.  You can
see this for yourself if you use the str() function like this:

str(GO_CC_description)
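
For readers without the annotation packages at hand, here is a minimal sketch of the same situation using a made-up S4 class. The class ToyTerm, its two slots, and the "TOY:n" IDs are assumptions for illustration only; they merely stand in for the real GOTerms class and real GO IDs.

```r
library(methods)

## Hypothetical stand-in for the GOTerms class (illustration only)
setClass("ToyTerm", representation(GOID = "character", Term = "character"))

## A named list of S4 objects, similar in shape to what mget() returns
toy_description <- list(
  "TOY:1" = new("ToyTerm", GOID = "TOY:1", Term = "first toy term"),
  "TOY:2" = new("ToyTerm", GOID = "TOY:2", Term = "second toy term")
)

## str() shows a list whose elements are formal (S4) objects with slots
str(toy_description)

## as.data.frame() has no method for such objects, so the coercion fails;
## try() captures the error instead of stopping the script
coercion <- try(as.data.frame(toy_description), silent = TRUE)
inherits(coercion, "try-error")
```

The point of the sketch is only that a list of S4 objects has to be taken apart slot by slot before it can become a data.frame.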

So I think you can see that if you want to get the individual
descriptions out of there you are going to have to be a bit more
specific.  So to just continue your example:

##You started by just getting the GOTERM info for the 1st set of elements
GO_CC_description=mget(clnList[[1]],GOTERM,ifnotfound=NA)
##And as we discussed this gives you a list of GOTerms objects back

##So if you really want a data frame, then you can always
## just break the parts of this object out (the parts you want) and
##then reassemble those into a data frame like this:

##Get the terms out
GO_CC_terms = sapply(GO_CC_description, function(x) x@Term)
##Let's combine those with the GO IDs
GO_CC_IDs = clnList[[1]]

df = data.frame(GO_CC_IDs, GO_CC_terms)

df
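
Since the goal was to print out a table file, and since ifnotfound=NA means some entries may be NA rather than an object with slots, the following is a rough sketch of one way to handle both. It again uses a made-up class: ToyTerm, the "TOY:n" IDs, the helper get_term, and the output file name are all assumptions for illustration, not part of GO.db or annotate.

```r
library(methods)

## Hypothetical stand-in for the GOTerms class (illustration only)
setClass("ToyTerm", representation(GOID = "character", Term = "character"))

## Toy lookup result; the last entry mimics an ID that was not found
toy_description <- list(
  "TOY:1" = new("ToyTerm", GOID = "TOY:1", Term = "first toy term"),
  "TOY:2" = new("ToyTerm", GOID = "TOY:2", Term = "second toy term"),
  "TOY:3" = NA
)

## Pull out the Term slot, returning NA for entries that are not S4 objects
get_term <- function(x) if (isS4(x)) x@Term else NA_character_
toy_terms <- vapply(toy_description, get_term, character(1))

## Build the data frame column by column, keeping strings as strings
toy_df <- data.frame(GOID = names(toy_description),
                     Term = unname(toy_terms),
                     stringsAsFactors = FALSE)

## Write a tab-delimited table file; the file name is arbitrary
out_file <- file.path(tempdir(), "toy_terms.txt")
write.table(toy_df, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```

vapply() is used instead of sapply() here so that the result is guaranteed to be a character vector with one element per ID, even when some lookups came back as NA.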

Hope this helps.

  Marc

giacomo.tuana at unimib.it wrote:
> Hi Marc,
>
> thanks a lot for your suggestions. Now I have another kind of problem. I
> want to coerce the GO terms I found into a data.frame or list so that I
> can print out a table file, or else build it with some extractor function
> for the GO terms data type. How can I do this?
>
> I used your previous code:
>
> library("mgu74av2.db")
> library("GO.db")
>
> ##Get the IDs you wanted
> all_probes_mgu <- ls(mgu74av2ENTREZID)
> ##Get the GO IDs for these IDs
> GOIDs = mget(all_probes_mgu, mgu74av2GO, ifnotfound=NA)
>
> ##You also wanted to remove things that were not part of the
> ##"CC" ontology.  There is a good way to do this in the ever so convenient
> ##annotate package...
> ##So for example, we can make use of the getOntology method like this:
> library("annotate")
> clnList = lapply(GOIDs, getOntology, "CC")
>
> So I added these lines:
> GO_CC_description=mget(clnList[[1]],GOTERM,ifnotfound=NA)
> GO_CC_description_df=as.data.frame(GO_CC_description)
> Error in as.data.frame.default(x[[i]], optional = TRUE) :
> cannot coerce class "GOTerms" into a data.frame
>
>
>
> Best Regards
>
>
> Giacomo
>
>
>
>
> -- 
>
>
> Dr. Giacomo Tuana Franguel
>
> Genopolis Consortium
> University of Milano-Bicocca
> Dept. of Biotechnology and Bioscience/ U4
> Piazza della Scienza 4 20126 Milano, Italy
> Tel +39 02 6448 3530
> Fax +39 02 4074 6210
>
>
> On Mon, 11 May 2009 11:49:03 -0700
>  mcarlson at fhcrc.org wrote:
> > Hi Giacomo,
> >
> > The problem isn't with the databases or the annotation packages, but
> > with how you are using toTable().  I would not use toTable() like that
> > since this is not what it was designed to do.  Instead, I would
> > recommend an approach more like this:
> >
> > library("mgu74av2.db")
> > library("GO.db")
> >
> > ##Get the IDs you wanted
> > all_probes_mgu <- ls(mgu74av2ENTREZID)
> > ##Get the GO IDs for these IDs
> > GOIDs = mget(all_probes_mgu, mgu74av2GO, ifnotfound=NA)
> >
> > ##You also wanted to remove things that were not part of the
> > ##"CC" ontology.  There is a good way to do this in the ever so convenient
> > ##annotate package...
> > ##So for example, we can make use of the getOntology method like this:
> > library("annotate")
> > clnList = lapply(GOIDs, getOntology, "CC")
> >
> > ##Finally if we want to get more details for each of these GOIDs, we
> > ##can use the GOTERM mapping in the usual way:
> >
> > ##So for the probe you used in your example:
> > clnList[1]
> > ##You can look up the details from the GOTERM table like this:
> > mget(clnList[[1]],GOTERM,ifnotfound=NA)
> >
> >
> > You weren't super clear about what exactly you were trying to do, so I
> > hope that this answers your questions.  If not, please let us know.
> >
> >
> >    Marc
> >
> >
> >
> >
> >
> > Quoting giacomo.tuana at unimib.it:
> >
> >>
> >>    Hi,
> >>    I found a redundancy in the GO annotation database while trying to
> >>    build a global table of annotation with probeset_ID, DB
> >>    cross-references (entrez_ID, gene name....) and GO annotation. For a
> >>    single probe there are several GO terms that are very similar (use of
> >>    synonyms) or equal (different punctuation) in their GO term
> >>    definitions; I think this could be a problem for functional
> >>    annotation. Can someone suggest how to deal with this situation,
> >>    or a different way to build a global table of annotation?
> >>    Here is the code I used for the CC category, with "100001_at" as an
> >>    example probeset ID:
> >>    library("mgu74av2.db")
> >>    library("GO.db")
> >>    go_mgu<-toTable(mgu74av2GO)
> >>    go_term_description<-toTable(GOTERM)
> >>    all_probes_mgu <- ls(mgu74av2ENTREZID)
> >>    go_mgu_descr<-merge(go_mgu[,1:3],go_term_description,by.x=2,by.y=1)
> >>    go_mgu_cc<-go_mgu_descr[which((go_mgu_descr[,6])=="CC"),]
> >>    go_mgu_cc[which((go_mgu_cc[,2]=="100001_at")),]
> >>    Thanks
> >>    Giacomo
> >>    --
> >>    Dr. Giacomo Tuana Franguel
> >>    Genopolis Consortium
> >>    University of Milano-Bicocca
> >>    Dept. of Biotechnology and Bioscience/ U4
> >>    Piazza della Scienza 4 20126 Milano, Italy
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives:    
> >> http://news.gmane.org/gmane.science.biology.informatics.conductor