[BioC] Human, Mouse and Rat homologs

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Thu May 20 20:54:24 CEST 2010


Let's say your data is in a data frame called "d"; then the code might be:

> d
  probe_id      ensembl_id
1  8039748 ENSG00000121410
2  7960947 ENSG00000175899
3  8144857 ENSG00000171428
4  8144866 ENSG00000156006
5  7976496 ENSG00000196136
6  8083415 ENSG00000114771

> 
> library(biomaRt)
> 
> mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
> 
> h2m <- getBM(attributes=c("ensembl_gene_id","mouse_ensembl_gene"), mart=mart)
> 
> my.h2m <- merge(d, h2m, by.x="ensembl_id", by.y="ensembl_gene_id", sort=FALSE)

> my.h2m
        ensembl_id probe_id mouse_ensembl_gene
1  ENSG00000121410  8039748 ENSMUSG00000022347
2  ENSG00000175899  7960947 ENSMUSG00000030111
3  ENSG00000171428  8144857 ENSMUSG00000025588
4  ENSG00000171428  8144857 ENSMUSG00000051147
5  ENSG00000171428  8144857 ENSMUSG00000056426
6  ENSG00000156006  8144866 ENSMUSG00000051147
7  ENSG00000156006  8144866 ENSMUSG00000056426
8  ENSG00000156006  8144866 ENSMUSG00000025588
9  ENSG00000196136  7976496 ENSMUSG00000066363
10 ENSG00000196136  7976496 ENSMUSG00000041536
11 ENSG00000196136  7976496 ENSMUSG00000066364
12 ENSG00000196136  7976496 ENSMUSG00000058207
13 ENSG00000196136  7976496 ENSMUSG00000079012
14 ENSG00000196136  7976496 ENSMUSG00000079013
15 ENSG00000196136  7976496 ENSMUSG00000021091
16 ENSG00000196136  7976496 ENSMUSG00000066361
17 ENSG00000196136  7976496 ENSMUSG00000041449
18 ENSG00000196136  7976496 ENSMUSG00000041481
19 ENSG00000114771  8083415 ENSMUSG00000027761
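
Note that the merge is one-to-many: a human gene with several mouse homologs gets one row per homolog. As a minimal offline sketch of that behaviour (not part of the original session), the block below hard-codes two probes and their homologs from the output above in place of the getBM() result, then counts homologs per gene and collapses them to one row per probe. The names "n.homologs" and "collapsed" and the ";" separator are illustrative assumptions.

```r
# Two rows of the probe table from the post above.
d <- data.frame(
  probe_id   = c("8039748", "8144857"),
  ensembl_id = c("ENSG00000121410", "ENSG00000171428"),
  stringsAsFactors = FALSE
)

# Stand-in for the getBM() result, hard-coded from the output shown above
# so that the example runs without a connection to Ensembl.
h2m <- data.frame(
  ensembl_gene_id    = c("ENSG00000121410", rep("ENSG00000171428", 3)),
  mouse_ensembl_gene = c("ENSMUSG00000022347", "ENSMUSG00000025588",
                         "ENSMUSG00000051147", "ENSMUSG00000056426"),
  stringsAsFactors = FALSE
)

# Same merge as in the session: one output row per (probe, homolog) pair.
my.h2m <- merge(d, h2m, by.x = "ensembl_id", by.y = "ensembl_gene_id",
                sort = FALSE)

# Number of mouse homologs found for each human gene.
n.homologs <- table(my.h2m$ensembl_id)

# One row per probe, with the homologs pasted into a single string.
collapsed <- aggregate(mouse_ensembl_gene ~ probe_id + ensembl_id,
                       data = my.h2m, FUN = paste, collapse = ";")
```

Whether to keep the long format or the collapsed one depends on what you do next; the long format is usually easier to merge against mouse annotation.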

________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Wolfgang Huber [whuber at embl.de]
Sent: 20 May 2010 19:40
To: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Human, Mouse and Rat homologs

Dear David

one of the possible solutions is via the BioMart interface to the
Ensembl databases. Please check the getLDS function in the biomaRt
package, which is described in that package's vignette.

        Best wishes
        Wolfgang

  Lyon scripsit 20/05/10 04:54:
> If I had a file containing a list of Human:
>
> 1) RefSeq IDs:
>
> "probe_id" "accession"
> "1" "8039748" "NM_130786"
> "2" "8039748" "NP_570602"
> "3" "7960947" "NM_000014"
> "4" "7960947" "NP_000005"
> "5" "8144857" "NM_000662"
> "6" "8144857" "NM_001160170"
>
> Or
>
> 2) Ensembl genes:
>
> "probe_id" "ensembl_id"
> "1" "8039748" "ENSG00000121410"
> "2" "7960947" "ENSG00000175899"
> "3" "8144857" "ENSG00000171428"
> "4" "8144866" "ENSG00000156006"
> "5" "7976496" "ENSG00000196136"
> "6" "8083415" "ENSG00000114771"
>
>
> Which R package converts such a list of IDs to their Mouse homologs, and can someone type the exact command?
>
> Thank you for your consideration.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


--


Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
