[BioC] protein orthologs

Marc Carlson mcarlson at fhcrc.org
Fri Mar 25 18:18:07 CET 2011


Hi Stefanie,

If you choose to use Inparanoid, it sounds like you will want to access 
the wrapped database directly using DBI, as the R mappings do not expose 
the score and related information that it sounds like you might be 
interested in.  This is easy enough to do with the DBI package; I just 
wanted to give you a heads up that there is more in the database than 
you will immediately see through the R mappings.

But yeah, Inparanoid does not pass along percent identity in the 
datasets it exposes (so you won't find that in our databases either).
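
To make the DBI route a bit more concrete, here is a minimal sketch of how 
one might look inside the wrapped database. It assumes the hom.Hs.inp.db 
annotation package is installed and that its hom.Hs.inp_dbconn() accessor 
returns the underlying DBI connection. peek_table() is a hypothetical 
helper written for this example, not part of any package, and the table 
and column names are discovered at run time rather than assumed.

```r
library(DBI)

# Hypothetical helper (not part of any package): report the columns and the
# first few rows of the first table whose name matches `pattern`.
peek_table <- function(con, pattern, n = 5) {
  tables <- dbListTables(con)
  hit <- grep(pattern, tables, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) stop("no table matching: ", pattern)
  tbl <- dbQuoteIdentifier(con, hit[1])
  list(
    table  = hit[1],
    fields = dbListFields(con, hit[1]),
    head   = dbGetQuery(con, paste("SELECT * FROM", tbl, "LIMIT", as.integer(n)))
  )
}

# Assumption: hom.Hs.inp.db is installed. If it is not, this part is skipped.
if (requireNamespace("hom.Hs.inp.db", quietly = TRUE)) {
  con <- hom.Hs.inp.db::hom.Hs.inp_dbconn()
  print(dbListTables(con))       # one table per species, plus metadata
  print(peek_table(con, "mus"))  # inspect whichever table matches "mus"
}
```

Looking at the field names first should show whether the score-type 
columns you are after are actually present before you write a more 
specific query.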


  Marc



On 03/23/2011 07:28 PM, Stefanie Carola Gerstberger wrote:
> Dear list,
> I'm trying to retrieve protein orthologs of human proteins for the most common model organisms (mouse, C. elegans, Drosophila, S. cerevisiae, etc.) so that, for my work in human, I can point other researchers to the appropriate homolog in other species.
>
> I was first using Ensembl Compara through biomart (Genome Res. 2009 Feb;19(2):327-35. Epub 2008 Nov 24. EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates.) to query those orthologs,
>
> but then I came across InParanoid as a second resource (Nucleic Acids Res. 2010 Jan;38(Database issue):D196-203. Epub 2009 Nov 5. InParanoid 7: new algorithms and tools for eukaryotic orthology analysis.).
>
> I also came across SIMAP as a protein homology resource (Nucleic Acids Res. 2010 Jan;38(Database issue):D223-6. Epub 2009 Nov 11. SIMAP--a comprehensive database of pre-calculated protein sequence similarities, domains, annotations and clusters.), which was used for the S. cerevisiae database at MIPS.
>
> I'm not experienced at all in this area, and reading the publications did not give me a good indication of which one is the better resource for ortholog relationships. What I noticed and did not like about InParanoid is that it (at least the website) does not give any % identities (but I haven't checked the R package yet).
>
> I would like to know which protein orthology database is commonly accepted as a standard resource in the protein evolutionary field.
> What is commonly used as reference in protein evolution labs?
> Can anyone give me advice on this?
>
> Thanks,
> Stefanie


