[BioC] topGO enrichment using ensembl gene list

Julien Roux Julien.Roux at unil.ch
Mon Mar 31 08:53:08 CEST 2008


Hello list,

I am using the package "topGO" to analyse GO enrichment of gene sets.

My genes are Ensembl IDs and do not come from a microarray, so I had 
to feed "topGOdata" with a gene2GO list (see 
http://thread.gmane.org/gmane.science.biology.informatics.conductor/14627).
I construct that list by mapping all Ensembl IDs to GO IDs using the 
package "biomaRt".
Then I proceed with my analysis:

 > GOdata <- new("topGOdata", ontology = "MF", allGenes = selectedList,
 +               description = "Ensembl GO enrichment",
 +               annot = annFUN.gene2GO, gene2GO = gene2GO)
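
A minimal sketch of how the two inputs to this call could be shaped, using a made-up mapping table in place of a real biomaRt result. The gene IDs, the column names and the choice of selected genes are all hypothetical, and the sketch assumes that "allGenes" takes a named 0/1 factor and "gene2GO" a named list of GO ID vectors, as I understand the topGO documentation:

```r
# Hypothetical gene-to-GO table standing in for a biomaRt query result
# (IDs and column names are made up for illustration).
mapping <- data.frame(
  ensembl_gene_id = c("ENSG0001", "ENSG0001", "ENSG0002", "ENSG0003", "ENSG0004"),
  go_id           = c("GO:0003677", "GO:0005515", "GO:0005515", "", "GO:0003824"),
  stringsAsFactors = FALSE
)

# Drop rows without a GO annotation (empty GO ID).
mapping <- mapping[mapping$go_id != "", ]

# Named list: one character vector of GO IDs per gene.
gene2GO <- split(mapping$go_id, mapping$ensembl_gene_id)

# Gene universe = all annotated genes; 'selected' is a hypothetical gene set.
universe <- names(gene2GO)
selected <- c("ENSG0001", "ENSG0004")

# Named factor with levels 0/1 (0 = background, 1 = selected gene).
selectedList <- factor(as.integer(universe %in% selected), levels = c(0, 1))
names(selectedList) <- universe
```

With this toy input the universe contains three genes, because the gene without any GO annotation is dropped before the list is built.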

Can you confirm that this approach is correct?

I also have several questions about topGO:
- Are the p-values in topGO corrected for multiple testing (FDR...)? My 
guess is that they are not, because of the non-independence of the tests...
- Are there any differences between the Fisher exact test (topGO) and the 
hypergeometric test (GOstats)? If yes, why did the two packages make 
different choices?
- It is not clear to me what the Kolmogorov-Smirnov test is testing, 
especially in my case, where I don't provide scores associated with my genes...
- Is there a way to test over- and under-representation of GO 
categories separately?
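
To make the second and fourth questions concrete: in base R, a one-sided Fisher exact test on a 2x2 table and the corresponding hypergeometric tail probability can be computed side by side, once for each direction. This is only a sketch with made-up counts, and it says nothing about how topGO or GOstats compute their p-values internally:

```r
# Made-up counts for a single GO category (illustration only).
N <- 1000  # genes in the universe
M <- 50    # genes annotated to the category
n <- 100   # selected genes
k <- 12    # selected genes that are annotated to the category

# 2x2 table, filled column-wise:
# rows = selected / not selected, columns = in category / not in category.
tab <- matrix(c(k, M - k, n - k, N - M - n + k), nrow = 2)

# Over-representation: P(X >= k) under the hypergeometric distribution.
p_fisher_over <- fisher.test(tab, alternative = "greater")$p.value
p_hyper_over  <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Under-representation: P(X <= k).
p_fisher_under <- fisher.test(tab, alternative = "less")$p.value
p_hyper_under  <- phyper(k, M, N - M, n)

# Compare the two computations for each direction.
same_over  <- isTRUE(all.equal(p_fisher_over, p_hyper_over))
same_under <- isTRUE(all.equal(p_fisher_under, p_hyper_under))
```

Here the "alternative" argument of fisher.test selects the direction being tested, which is the kind of separation asked about in the last question.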
 
Thanks a lot in advance for your help or tips.
Julien

-- 
Julien Roux, PhD student
http://www.unil.ch/dee/page22707.html
Department of Ecology and Evolution
Biophore, University of Lausanne, 1015 Lausanne, Switzerland
tel: +41 21 692 4221    fax: +41 21 692 4165


