[BioC] PFAM gene enrichment test using GOstats and Category

Davis, Wade davisjwa at health.missouri.edu
Mon Jan 9 21:00:01 CET 2012


Henrik,
For years I've been using a general function I wrote to test for enrichment of GO terms, PFAM, PATH, etc., but it doesn't use the KEGGFrame class directly the way you do below. Maybe this code snippet from that function will help?
What each variable holds should be obvious from its name, I hope.


This is the call to my function, which does a lot of things not pertinent to your question, so I am not showing the whole function...

CatReports(organism="Human",
in_data=lumi.N.Q,
annotation_used="lumiHumanAll",
mytab=q_ac.cancer,
contrastName="DASL Cancer",
Category="PFAM",
table_p_cutoff=0.01,
fold_cut=1,
fdr_cut=0.5,
min_term_size=10,
var.filter.flag=FALSE,
xtable.sweave=TRUE,
sweave.table.label="table:Cancer_PFAMresults",
sweave.caption="Significantly Enriched PFAM Pathways: Cancer"
)



.....

if(Category=="PFAM"){
# PFAM ##############################################
paramsCond <- new("PFAMHyperGParams",
geneIds= geneUniverse,
universeGeneIds= entrezUniverse,
annotation=annotation_used,
pvalueCutoff= table_p_cutoff,
testDirection="over")
}

if(Category=="GOBP"){
# BP ##############################################
paramsCond <- new("GOHyperGParams",
geneIds= geneUniverse,
universeGeneIds= entrezUniverse,
annotation=annotation_used,
ontology="BP",
pvalueCutoff= table_p_cutoff,
conditional=T,
testDirection="over")
}

if(Category=="GOMF"){
# MF ##############################################
paramsCond <- new("GOHyperGParams",
geneIds= geneUniverse,
universeGeneIds= entrezUniverse,
annotation=annotation_used,
ontology="MF",
pvalueCutoff= table_p_cutoff,
conditional=T,
testDirection="over")
}

if(Category=="GOCC"){
# CC ##############################################
paramsCond <- new("GOHyperGParams",
geneIds= geneUniverse,
universeGeneIds= entrezUniverse,
annotation=annotation_used,
ontology="CC",
pvalueCutoff= table_p_cutoff,
conditional=T,
testDirection="over")
}

##############################################
# Run test
##############################################
OverCAT <- hyperGTest(paramsCond)
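
The four branches above differ only in the parameter class and, for GO, in the ontology and conditional arguments, so they could be collapsed into a single lookup. The sketch below is one way to do that. Note that `hyperg_spec` is a hypothetical helper, not part of the original function, and the commented-out call assumes that Category/GOstats are loaded and that geneUniverse, entrezUniverse, annotation_used and table_p_cutoff are defined as in the snippet.

```r
# Hypothetical helper: map a Category label to the parameter class name
# and the extra arguments that only the GO branches need.
hyperg_spec <- function(Category) {
  go_extra <- function(ont) list(ontology = ont, conditional = TRUE)
  switch(Category,
         PFAM = list(class = "PFAMHyperGParams", extra = list()),
         GOBP = list(class = "GOHyperGParams", extra = go_extra("BP")),
         GOMF = list(class = "GOHyperGParams", extra = go_extra("MF")),
         GOCC = list(class = "GOHyperGParams", extra = go_extra("CC")),
         stop("Unsupported Category: ", Category))
}

spec <- hyperg_spec("GOCC")
str(spec)

# With Category/GOstats loaded, the parameter object could then be built as:
# common <- list(spec$class,
#                geneIds = geneUniverse,
#                universeGeneIds = entrezUniverse,
#                annotation = annotation_used,
#                pvalueCutoff = table_p_cutoff,
#                testDirection = "over")
# paramsCond <- do.call(new, c(common, spec$extra))
```

Failing loudly on an unknown label avoids the situation in the if-chain where a mistyped Category leaves paramsCond undefined (or stale) when hyperGTest is called.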

..........



Now, I think what I've done would work for you if you created your own annotation package (environment?). I've done that before with this function and it worked fine for me. Building your own annotation package is not hard, but it may be more trouble than you want to take on.
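
If building an annotation package is more than is needed, the over-representation test itself can also be run directly on a two-column PFAM/gene table with base R. The following is a minimal sketch under stated assumptions: the toy data are made up (only the column names follow pfamframedata), `pfam_enrichment` is a hypothetical helper, and it performs a plain unconditional hypergeometric test via `phyper`, so it is not guaranteed to reproduce hyperGTest output exactly (there is no minimum term size filter, for example).

```r
# Toy mapping in the same two-column layout as pfamframedata (values made up).
pfam_map <- data.frame(
  pfam   = c("PF00001", "PF00001", "PF00001", "PF00002",
             "PF00002", "PF00003", "PF00001"),
  isogid = c("g1", "g2", "g3", "g3", "g4", "g5", "g1"),
  stringsAsFactors = FALSE
)

universe <- unique(pfam_map$isogid)  # all annotated genes
selected <- c("g1", "g2")            # hypothetical genes of interest

# Hypothetical helper: one-sided (over-representation) hypergeometric test
# for every PFAM domain in the mapping.
pfam_enrichment <- function(map, selected, universe) {
  map <- unique(map[map$isogid %in% universe, c("pfam", "isogid")])
  selected <- intersect(selected, universe)
  sets <- split(map$isogid, map$pfam)   # gene set per PFAM ID
  N <- length(universe)                 # universe size
  n <- length(selected)                 # number of selected genes
  res <- do.call(rbind, lapply(names(sets), function(id) {
    K <- length(sets[[id]])                       # genes carrying the domain
    k <- length(intersect(sets[[id]], selected))  # of which selected
    data.frame(pfam = id, size = K, count = k,
               # P(X >= k) when drawing n genes from the universe
               pvalue = phyper(k - 1, K, N - K, n, lower.tail = FALSE),
               stringsAsFactors = FALSE)
  }))
  res[order(res$pvalue), ]
}

res <- pfam_enrichment(pfam_map, selected, universe)
print(res)
```

The call to `unique()` matters for data like the head() output further down this thread, where the same domain is listed more than once for the same isogroup; without it, repeated hits would inflate the term sizes.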

Good luck,
Wade



-----Original Message-----
From: Henrik Hjarvard de Fine Licht [mailto:Henrik_Hjarvard.de_Fine_Licht at biol.lu.se] 
Sent: Tuesday, January 03, 2012 2:07 AM
To: bioconductor at r-project.org; mcarlson at fhcrc.org
Subject: [BioC] PFAM gene enrichment test using GOstats and Category

Hi Marc

Thank you very much for your answer, and I'm really sorry for my much belated reply.

I have tried your suggestion, but if I use KEGGFrame directly it gives me an error saying that the PFAM IDs are not valid KEGG IDs (see below).

I suspect that what you have in mind is not what I have attempted here, but rather that I create my data object manually by specifying a two-column data.frame and the organism in a separate slot. However, I'm unsure how to do this. Is there a "quick" way to do it, or should I look into R's object-oriented programming syntax?

All the best, and many thanks in advance for your reply,
Henrik

Rcode snippets:
> pfamframedata <- read.table("IsogroupAll_pfam.mysqlout", header=TRUE,
+                             colClasses = "character")
> head(pfamframedata)
     pfam        isogid
1 PF00978 isogroup00001
2 PF05381 isogroup00001
3 PF01660 isogroup00001
4 PF05381 isogroup00001
5 PF01443 isogroup00001
6 PF01660 isogroup00001

> pfamFrame=KEGGFrame(pfamframedata, organism="XXXX")
Error in .testKEGGFrame(x, organism) :
  None of elements in the 1st column of your data.frame object are legitimate KEGG IDs.





Hi Henrik,

You are correct that there isn't a PFAMFrame yet. But the usage situation for PFAM should be very similar to what happens with KEGG. So theoretically, you should be able to use a KEGGFrame for this purpose.
Have you tried this?


   Marc



On 11/25/2011 02:12 AM, Henrik De Fine Licht wrote:
> Dear List
>
> I want to do hypergeometric testing of GO, KEGG and PFAM terms. I'm 
> using the GOstats package and the method described in the vignette 
> "How to use GOstats and Category to do hypergeometric testing with 
> unsupported model organisms" by M. Carlson, Oct. 31, 2011, because my 
> organism is a non-model organism.
>
> I have obtained the annotation for my gene sets from other sources, and 
> the method described in the vignette is working perfectly for GO and 
> KEGG. Now I would like to do the same for PFAM domains, but I'm 
> having trouble figuring out how to do this.
>
> It seems that there is no function for creating a PFAMFrame object 
> similar to the GOFrame and KEGGFrame functions. I guess such an 
> object could be constructed by other means, but I'm unsure how this is done.
>
> Many thanks for your time and help
>
> best,
> Henrik Licht



More information about the Bioconductor mailing list<https://stat.ethz.ch/mailman/listinfo/bioconductor>
-------------------------------------------------------------------------------------
Henrik Hjarvard de Fine Licht, PhD, Post Doctoral Researcher Centre for Genomic Ecology, Microbial Ecology Department of Biology, Lund University,
SE-223 62 Lund, Sweden
Email: henrik.de_fine_licht at biol.lu.se<mailto:henrik.de_fine_licht at biol.lu.se>
Website:  http://www.lu.se/henrik-de-fine-licht
