[BioC] GSEA, topGO, GOstats...? what's a good way to look at GO over-representation?

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Wed Feb 10 01:49:32 CET 2010


Jose

You weren't being cryptic, the packages are!

I have put some code on the use of topGO on our website:

http://bioinformatics.iah.ac.uk/sample-code

I hope this is of use to you and others.

Thanks
Mick

________________________________________
From: J.delasHeras at ed.ac.uk [J.delasHeras at ed.ac.uk]
Sent: 08 February 2010 19:29
To: michael watson (IAH-C)
Cc: bioconductor
Subject: RE: [BioC] GSEA, topGO, GOstats...? what's a good way to look at GO over-representation?

Hi Mick,

I didn't think I was cryptic at all, but I'm sorry I didn't make it
clear enough.
I was just looking for a good way to do GO over-representation
analysis starting from, say, Entrez Gene IDs, without being tied to a
particular array definition (if GO IDs are needed I can fish them out).
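
For readers following the thread, here is a minimal sketch of the test that underlies this kind of analysis: a one-sided hypergeometric (Fisher-type) test per GO term, run from a plain gene-to-GO mapping with no array annotation involved. It is written in Python only to keep it dependency-free. It is not the topGO method or anyone's code from this thread. The function names, the gene labels (g1..g10) and the term labels (TERM_A etc.) are all made up for illustration. In R the same tail probability should be obtainable with phyper(k - 1, K, N - K, n, lower.tail = FALSE).

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from a
    universe of N genes, K of which are annotated to the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def go_overrepresentation(gene_to_go, selected, universe=None):
    """Return (term, k, K, p) tuples sorted by p-value.

    gene_to_go: dict mapping a gene ID to a set of term IDs
    selected:   iterable of gene IDs of interest
    universe:   optional background; defaults to all annotated genes
    """
    annotated = set(gene_to_go)
    universe = annotated if universe is None else set(universe) & annotated
    selected = set(selected) & universe

    # Invert gene -> terms into term -> genes, restricted to the universe.
    term_to_genes = {}
    for gene in universe:
        for term in gene_to_go[gene]:
            term_to_genes.setdefault(term, set()).add(gene)

    N, n = len(universe), len(selected)
    results = []
    for term, genes in term_to_genes.items():
        k = len(genes & selected)
        if k == 0:
            continue  # no selected gene carries this term
        p = hypergeom_upper_tail(k, len(genes), n, N)
        results.append((term, k, len(genes), p))
    return sorted(results, key=lambda r: r[3])


# Toy data (hypothetical IDs, not real Entrez or GO identifiers).
gene_to_go = {
    "g1": {"TERM_A", "TERM_B"},
    "g2": {"TERM_A", "TERM_B"},
    "g3": {"TERM_A", "TERM_B"},
    "g4": {"TERM_B"},
    "g5": {"TERM_B"},
    "g6": {"TERM_B"},
    "g7": {"TERM_B"},
    "g8": {"TERM_B"},
    "g9": {"TERM_C"},
    "g10": {"TERM_C"},
}
selected = ["g1", "g2", "g3", "g4"]

results = go_overrepresentation(gene_to_go, selected)
for term, k, K, p in results:
    print(f"{term}\t{k}/{K}\tp={p:.4f}")
```

The p-values here are uncorrected for multiple testing, and the sketch treats every term independently; it ignores the parent/child structure of the GO graph, which is one of the things packages such as topGO are designed to handle.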

I think that my difficulty with some of these packages (topGO
included) was generating an annotation package, generating the correct
data structure, and the examples being Affy-centric. I'm sure it will
seem very simple once I've figured it out, but not right now.

I have looked at tools like FatiGO and FatiScan etc... but I was never
very happy with them. Admittedly this was a while ago and matters may
have improved.
Sometimes I felt that some genes didn't have a matching GO term
despite my knowing that there was GO information for them, and it
often took a very long time to get the results. Maybe I'll have a look
again, but I'd much rather keep the work in R *if reasonable*: scripts
one can reuse, every step is documented, etc etc.

Thanks for your comments, I'll look more closely at topGO.

Jose




Quoting "michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk>:

> These are all a little cryptic!
>
> I have some sample code for topGO that doesn't use Affy IDs. It uses
> a dataset that I can't give out, but at least it's not Affy.
>
> I ended up writing my own code to do this that works from
> data.frames etc and sucks the latest annotation directly from the
> web, rather than using the bioc annotation packages.  Some of this
> was wrapped into our package CORNA
> (http://bioinformatics.iah.ac.uk/software/corna)
>
> Also, are you devoted to R? If not, then why not use something like
> FatiGO? http://www.fatigo.org/
>
> Mick
>
> -----Original Message-----
> From: bioconductor-bounces at stat.math.ethz.ch
> [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of
> J.delasHeras at ed.ac.uk
> Sent: 08 February 2010 16:30
> To: bioconductor
> Subject: [BioC] GSEA, topGO, GOstats...? what's a good way to look
> at GO over-representation?
>
>
> Dear list,
>
> I have a few gene lists derived from a human Illumina expression
> array. I just have Illumina IDs, I have gene names, and I have entrez
> gene IDs I obtained for them.
>
> I would like to analyse the list to look for over-representation of
> some category, probably using gene ontologies.
> I see there are several packages that seem to address this, although
> when I look at the examples I get the feeling they were designed with
> Affy arrays in mind and depend on an Affy array design...
>
> I am sure I am not the only one wanting to do this type of work on
> non-Affy arrays... I would appreciate a nudge towards the right
> package, or a way to "persuade" it to work with non-Affy array data,
> after all I imagine that all the array design is used for is the
> definition of the genelists/universe and retrieval of the relevant GO
> ids.
>
> Thank you for any helpful comments.
>
> Jose
>
> --
> Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
> The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6513374
> Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
> Swann Building, Mayfield Road
> University of Edinburgh
> Edinburgh EH9 3JR
> UK
> *********************************************
> NEW EMAIL from July'09: nach.mcnach at gmail.com
> *********************************************
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>




