[BioC] Fwd: [R] Question to use R plot GO pie chart

Waverley @ Palo Alto waverley.paloalto at gmail.com
Fri Dec 25 03:12:33 CET 2009


Can someone help?

Thanks a lot in advance.


---------- Forwarded message ----------
From: Martin Morgan <mtmorgan at fhcrc.org>
Date: Thu, Dec 24, 2009 at 5:35 PM
Subject: Re: [R] Question to use R plot GO pie chart
To: "Waverley @ Palo Alto" <waverley.paloalto at gmail.com>
Cc: r-help at r-project.org


Waverley @ Palo Alto wrote:
> Hi,
>
> I have a list of IPI gene IDs.  I want to find out whether there is a
> package which can map the gene ontology to these IPIs, and plot the
> pie chart to demonstrate the molecular function distributions.
>
> The input is like the following gene IPI IDs:
> IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN
> IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807|H-INV:HIT000028861|VEGA:OTTHUMP00000078377
> Tax_Id=9606 Gene_Symbol=ZN
> IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP00000076687
> Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
> IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037|REFS
> IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010|H-INV:HIT000039466|VEGA:OTTHUMP00000019944
> IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP00000306279|RE
> IPI:IPI00000634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP00000263102|REFS
>
> I want to plot the distribution of these genes across the GO
> molecular function categories as a pie chart.  An example is shown at
> the following link http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
>
>
> Can someone help?

I'm not sure that it is this easy. IPI ids are protein identifiers,
whereas GO categories classify genes. Neither the mapping from protein
to gene nor the mapping from gene to GO category is 1:1, and GO
categories form a hierarchy. So there are significant decisions to be
made in representing IPI identifiers in a pie chart of GO terms.

Bioconductor maintains 'org' and 'GO' database packages that provide the
necessary link between IPI protein ids and GO gene ontology categories,
via ENTREZ gene ids. Code might look like

 ## once only, to install packages
 source('http://bioconductor.org/biocLite.R')
 biocLite(c('org.Hs.eg.db', 'GO.db'))

 ## from IPI to ENTREZ id, not 1:1
 library(org.Hs.eg.db)
 ## assumes a map such as org.Hs.egPFAM, whose values are named by IPI id
 ipi2eg = revmap(eapply(org.Hs.egPFAM, names)) ## NOT 1:1 map

 ## Assume ipiIds is, e.g., c('IPI00008860', 'IPI00019922')
 egIds = revmap(ipi2eg[ipiIds])

 ## get GO terms, also not 1:1
 goIds = eapply(org.Hs.egGO[names(egIds)], names)
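
The ipiIds vector assumed above can be extracted from input lines like
those in the question. What follows is only a minimal sketch in base R:
the variable name ipiLines is made up for illustration, and its entries
are shortened copies of three lines from the question. It assumes every
line starts with 'IPI:' followed by the accession and a version suffix.

```r
## Hypothetical input: shortened copies of lines from the question
ipiLines <- c(
  "IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2",
  "IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81",
  "IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366"
)

## Keep the accession that follows the 'IPI:' prefix; the '.N' version
## suffix and all other cross-references are dropped
ipiIds <- sub("^IPI:(IPI[0-9]+).*$", "\\1", ipiLines)

## The same protein may be listed more than once
ipiIds <- unique(ipiIds)
```

Lines that do not start with 'IPI:' are returned unchanged by sub(), so
they would need to be filtered out separately.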

You're still left with the problem of resolving multiple mappings and
the hierarchical relationship between GO terms. Asking on the
Bioconductor mailing list

 http://bioconductor.org/docs/mailList.html

is likely to lead to helpful answers.
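
One possible way to resolve the multiple mappings for a pie chart is to
give each protein a total weight of 1 and split it evenly over its GO
terms, so that proteins with many annotations do not dominate. The
sketch below uses base R only. The prot2go list and its term labels are
invented toy data, not real annotations, and the even split is just one
of several defensible choices. It does not address the GO hierarchy; in
practice the terms would first need collapsing to a common level.

```r
## Toy data (invented): protein id -> GO molecular function labels
prot2go <- list(
  P1 = c("binding", "catalytic activity"),
  P2 = "binding",
  P3 = c("binding", "transporter activity"),
  P4 = character(0)                 # protein without any annotation
)

## Each protein contributes a total weight of 1, split over its terms;
## unname() keeps the protein ids out of the names of the result
weights <- unlist(lapply(unname(prot2go), function(terms) {
  if (length(terms) == 0L) return(c(unannotated = 1))
  setNames(rep(1 / length(terms), length(terms)), terms)
}))

## Sum the fractional weights per term; the slices add up to the
## number of proteins
slices <- tapply(weights, names(weights), sum)

pie(slices, main = "GO molecular function (toy data)")
```

Counting every protein once per term instead would inflate the total
beyond the number of proteins, which is the main thing the weighting is
meant to avoid.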

Martin


> Thanks much in advance.
>
> Merry Christmas!!
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


