[BioC] enrichment + topGO
pshannon at fhcrc.org
Tue Oct 2 18:16:40 CEST 2012
On Oct 2, 2012, at 8:04 AM, Maryam wrote:
> I am trying to use topGO, but I have problems using a list of gene names: I don't know which annotation function I should use, or what its parameters should be. It would be nice if I could find examples of using topGO with ENSEMBL ids or gene names.
Bioconductor annotations tend to favor Entrez gene IDs. However, it is straightforward to convert from ENSEMBL ids, or HUGO gene symbols, to Entrez ids. If you are working with human data, this might help.
library(org.Hs.eg.db)
# first take a look at one of the tables provided, so you understand the basic structure
head(toTable(org.Hs.egENSEMBL2EG))
  gene_id      ensembl_id
1      95 ENSG00000114786
2     582 ENSG00000256349
3     593 ENSG00000255730
4     715 ENSG00000159403
5    1201 ENSG00000261832
6    1734 ENSG00000211448
# now create a sample set of ensembl ids to convert
ensembl.ids <- head (toTable(org.Hs.egENSEMBL2EG))$ensembl_id
# do the conversion. all of these will convert, but the 'ifnotfound' argument covers odd cases
mget(ensembl.ids, org.Hs.egENSEMBL2EG, ifnotfound=NA)
 "95" "100526760"
You will probably need to check for double assignments like the one above, where a single Ensembl id maps to more than one Entrez id; some web curation at NCBI, for instance, can resolve them. Checking for NAs would also be wise.
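The two checks just mentioned can be sketched in base R. This is only an illustration: the `converted` list below is a hand-written stand-in for what `mget()` returns (a list named by Ensembl id), and `ENSG_MADE_UP` is an invented id included to show the NA case.

```r
# Hypothetical stand-in for the list returned by
#   mget(ensembl.ids, org.Hs.egENSEMBL2EG, ifnotfound=NA)
converted <- list(
  ENSG00000114786 = c("95", "100526760"),  # double assignment
  ENSG00000159403 = "715",                 # clean one-to-one mapping
  ENSG_MADE_UP    = NA                     # invented id that failed to convert
)

# number of non-NA Entrez ids found for each Ensembl id
n.hits <- sapply(converted, function(x) sum(!is.na(x)))

not.found  <- names(n.hits)[n.hits == 0]     # ids to look up by hand
ambiguous  <- names(n.hits)[n.hits > 1]      # ids needing curation
unique.map <- unlist(converted[n.hits == 1]) # safe to pass downstream

print(not.found)
print(ambiguous)
print(unique.map)
```

How to resolve the ambiguous ids (keep the first hit, keep all, or curate by hand) depends on your analysis, so the sketch only separates them out.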
Once you have converted your identifiers, the topGO vignette should be useful.
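As a rough starting point before reading the vignette, here is a minimal sketch of feeding Entrez ids to topGO. It assumes the topGO and org.Hs.eg.db packages are installed (the topGO part is skipped otherwise). The helper `make.gene.factor` is a hypothetical name introduced here, and the "interesting" genes are an arbitrary placeholder, not real results.

```r
# topGO expects a named factor over the gene universe:
# level 1 = interesting gene, level 0 = background.
# (make.gene.factor is a helper made up for this sketch.)
make.gene.factor <- function(universe, interesting) {
  f <- factor(as.integer(universe %in% interesting), levels = c(0, 1))
  names(f) <- universe
  f
}

if (requireNamespace("topGO", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  library(topGO)
  library(org.Hs.eg.db)

  universe    <- mappedkeys(org.Hs.egGO)  # Entrez ids with GO annotation
  interesting <- head(universe, 200)      # placeholder: use your converted ids
  gene.list   <- make.gene.factor(universe, interesting)

  go.data <- new("topGOdata",
                 ontology = "BP",
                 allGenes = gene.list,
                 annot    = annFUN.org,      # annotation function for org.* packages
                 mapping  = "org.Hs.eg.db",
                 ID       = "entrez")

  result <- runTest(go.data, algorithm = "classic", statistic = "fisher")
  print(GenTable(go.data, classicFisher = result, topNodes = 10))
}
```

As far as I know, the `ID` argument of `annFUN.org` also accepts values such as "ensembl" and "symbol", which may let you skip the conversion step entirely; please check `?annFUN.org` in your installed version before relying on that.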
Hope this helps.
> Apart from this issue, I want to change topGO so that it can accept ontologies other than GO.
> Does anyone know how to do that?