[BioC] Simple pathway enrichment analysis for gene lists

Enrico Ferrero enricoferrero86 at gmail.com
Tue Sep 10 18:23:57 CEST 2013


Hi Paul,

Thanks for your suggestions.
I already performed a GO enrichment analysis and I was specifically
looking at the enrichment of pathways. You also suggest GSEA, but,
correct me if I'm wrong, I think I would need some additional value to
rank my list of gene IDs, as the GSEA algorithm requires a ranked list
to start with.
I guess at the moment my best option is to use Category and the
outdated KEGG.db to have an idea of the enriched pathways and then use
some non-Bioconductor tool (any suggestion?).
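
For reference, the "simple numerical method" discussed in this thread (a one-sided hypergeometric test, equivalent to a one-sided Fisher's exact test) can be sketched in a few lines without any ranking values. This is only an illustration under stated assumptions: the function names, the toy gene IDs, and the choice of gene universe are hypothetical and are not taken from Category, GOstats, or any other package.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, list_size, overlap):
    """Upper-tail hypergeometric p-value: P(X >= overlap).

    X is the number of pathway genes found when drawing `list_size` genes
    without replacement from a universe of `universe_size` genes, of which
    `pathway_size` belong to the pathway.
    """
    max_overlap = min(pathway_size, list_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def enrich(gene_list, pathways, universe):
    """Test every pathway for over-representation in `gene_list`.

    `pathways` maps a pathway name to an iterable of gene IDs (hypothetical
    input format). Genes outside `universe` are ignored. Returns a list of
    (pathway, overlap, p_value) tuples sorted by p-value.
    """
    universe = set(universe)
    genes = set(gene_list) & universe
    results = []
    for name, members in pathways.items():
        members = set(members) & universe
        overlap = len(genes & members)
        p = hypergeom_enrichment_p(len(universe), len(members), len(genes), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy example with made-up gene IDs.
toy_universe = [f"g{i}" for i in range(10)]
toy_pathways = {"A": ["g0", "g1", "g2", "g3"], "B": ["g7", "g8", "g9"]}
toy_results = enrich(["g0", "g1", "g2"], toy_pathways, toy_universe)
```

The p-values from such a test depend heavily on the chosen universe, and with many pathways they would still need a multiple-testing adjustment.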

I'm not sure this is the right place, but, on a more general note, may
I ask if there are any plans to provide better and more integrated
support for pathway enrichment analysis in Bioconductor?
For example, would it be possible to build something similar to an
up-to-date KEGG.db using KEGG's REST API? After all, packages such as
gage and ROntoTools get their information from KEGG in that way.
Alternatively, while not as comprehensive as KEGG, Reactome is
completely open source and open access and I would be surprised if
there wasn't a more fruitful way to integrate it with the core
Bioconductor tools and packages.
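
To make the REST idea concrete, here is a rough sketch of how a pathway-to-gene mapping could be assembled from KEGG's "link" operation. It assumes the endpoint `https://rest.kegg.jp/link/pathway/<organism>` and a two-column, tab-separated response (gene ID, then pathway ID); both should be checked against the current KEGG API documentation and terms of use. The function names are hypothetical, and this is not how gage, ROntoTools, or KEGGREST are implemented.

```python
from collections import defaultdict
from urllib.request import urlopen


def fetch_kegg_link(organism="hsa", base_url="https://rest.kegg.jp"):
    """Download the gene-to-pathway link table for one organism.

    Assumption: the KEGG REST "link" operation is available at
    <base_url>/link/pathway/<organism> and returns plain text.
    """
    url = f"{base_url}/link/pathway/{organism}"
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_kegg_link(text):
    """Turn tab-separated 'gene<TAB>pathway' lines into {pathway: {genes}}."""
    pathways = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        gene, pathway = line.split("\t")
        pathways[pathway].add(gene)
    return dict(pathways)


# Offline example using a small hand-written sample in the assumed format.
sample = "hsa:10327\tpath:hsa00010\nhsa:124\tpath:hsa00010\nhsa:124\tpath:hsa00071\n"
sample_mapping = parse_kegg_link(sample)
```

Calling `parse_kegg_link(fetch_kegg_link("hsa"))` would then give a mapping of each pathway to its gene set, which is the kind of input a gene-list over-representation test needs.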

I'm probably being naive here, but I'm afraid I don't fully understand
what the general consensus of the Bioconductor community is on the
current status and future directions of pathway analysis tools.

Thank you.
Best,

On 9 September 2013 16:25, Paul Shannon <paul.thurmond.shannon at gmail.com> wrote:
> Hi Enrico,
>
> reactome.db is best described, I believe, as simply a bioc rendering of Reactome's sql database -- Marc, please correct me if I am wrong.
>
> There is thus a data representation obstacle when using reactome.db: molecular relations are described, with pathway/gene mappings not so easy to get at.
> In addition, and despite Reactome's many strengths, its coverage is incomplete. The canonical wnt pathway, for instance, is (at my last check) not included.
>
> If you have a list of geneIDs, exploratory analysis can usefully start out with GO enrichment, KEGG enrichment, and GSEA. Though the information in KEGG.db has not been updated in a couple of years, it is still very useful for exploratory data analysis. Any enrichments you discover using these assorted gene/category associations may lead you to a close study of particular functions or pathways, and it is at this point that you may wish to get the latest and most specific information via KEGGREST, Reactome, and (with our next release) the new PSICQUIC package (see http://code.google.com/p/psicquic/).
>
> I hope this helps. Let us know if it falls short, or if new questions arise.
>
> - Paul
>
>
> On Sep 9, 2013, at 7:53 AM, Enrico Ferrero wrote:
>
>> Dear list,
>>
>> Can anybody suggest how to perform a simple pathway enrichment
>> analysis starting from a list of gene IDs?
>>
>> I know about the gage and ROntoTools packages that use KEGGREST to
>> retrieve an up-to-date version of the KEGG database, but, as far as I
>> understand, they require a microarray experiment as input (or at least
>> fold changes and p-values).
>>
>> Since this time around I'm not starting from a microarray experiment
>> but I just have a gene list, I'm looking for a way to perform pathway
>> enrichment analysis using a simple numerical method such as Fisher's /
>> hypergeometric test.
>>
>> I know the Category package still provides a KEGGHyperG class (which
>> would be perfect!), but the results are based on the outdated version
>> of KEGG (via KEGG.db, I guess).
>>
>> Are there any good alternatives available out there? Would it be
>> possible to use reactome.db in conjunction with the Category/GOstats
>> functions for example?
>>
>> Thank you!
>> Best,
>>
>> --
>> Enrico Ferrero
>> Department of Genetics
>> Cambridge Systems Biology Centre
>> University of Cambridge
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>

-- 
Enrico Ferrero
Department of Genetics
Cambridge Systems Biology Centre
University of Cambridge


