[BioC] [topGO] More flexible way to select 'interesting' genes

Wed Aug 7 14:46:23 CEST 2013

Hi,

Adrian, I'm using your topGO package and really appreciate how
powerful and customizable it is with respect to the choice of
algorithms and statistical tests.

There's just one limitation that I don't really understand; hopefully
you or somebody else on the list can shed some light on it.

I usually select 'interesting' genes from gene expression experiments
based on two different parameters: their p-value and their
log(fold-change).

As far as I understand, if I want to run a GO enrichment analysis in
topGO using a statistical test that uses ranked gene lists such as KS
(Kolmogorov–Smirnov test, also used by GSEA), I can only filter on one
parameter (typically the p-value).

This is a consequence of the way the topGOdata objects are built, e.g.:

myData <- new("topGOdata", description="myData", ontology="BP",
              allGenes=myAllGenes, geneSel=geneSelFunc, nodeSize=5,
              annot=annFUN.db, affyLib="hgu133plus2.db")

where:

- myAllGenes is a named vector of p-values, one for each probe on the
array, named by probe ID, and

- geneSelFunc is a function to select the interesting ones, such as:

geneSelFunc <- function (score) {
    return(score <= 0.05)
}
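
To make this concrete, here is a minimal sketch of the two objects in
plain R. The probe IDs and p-values are made up for illustration only.
As far as I understand, the selection function is handed nothing but
the scores, which is why it can only filter on them:

```r
# Hypothetical p-values, named by probe ID (values are made up)
myAllGenes <- c("1007_s_at" = 0.001, "1053_at" = 0.20, "117_at" = 0.04)

# Selection function as above: it receives the scores and nothing else
geneSelFunc <- function(score) {
    return(score <= 0.05)
}

# Logical vector marking the probes that pass the p-value cutoff
selected <- geneSelFunc(myAllGenes)

# Probe IDs of the selected probes
names(myAllGenes)[selected]
```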

I'm basically looking for a more flexible way to select my
interesting probes: for example, I'd like to select only probes
that have a p-value <= 0.05 and |log(fold-change)| >= 1.
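
Outside topGO, in plain R, the selection I have in mind would look
roughly like this. Here myPvals and myLogFC are hypothetical vectors
with made-up values, both named by probe ID and in the same order:

```r
# Hypothetical per-probe statistics (values are made up)
myPvals <- c("1007_s_at" = 0.001, "1053_at" = 0.20, "117_at" = 0.04)
myLogFC <- c("1007_s_at" = 1.8,   "1053_at" = -2.1, "117_at" = 0.3)

# Interesting probes: significant AND with a large enough absolute change
interesting <- myPvals <= 0.05 & abs(myLogFC) >= 1

# Probe IDs that satisfy both criteria
names(interesting)[interesting]
```

What I can't see is how to express the second condition through
geneSel, since that function only receives one score per probe.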

Is there any way to do this?

Thank you.
Best,

-- 
Enrico Ferrero
PhD Student
Department of Genetics
Cambridge Systems Biology Centre
University of Cambridge