[BioC] biomaRt problem

Wolfgang Huber whuber at embl.de
Tue Aug 3 23:21:40 CEST 2010

Hi Anupam Sinha,

as always, following the posting guide (including the output of 
sessionInfo() and a reproducible example that does not depend on a 
private file that exists only on your computer) would be useful.

Your getBM query is incomplete: you specify an argument for 'values', 
but not for 'filters'. In effect, your values are ignored and no 
filtering is performed, so the query is made on all ~51,000 genes in 
the dataset.
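
For illustration, a query that pairs 'filters' with 'values' might look 
like the sketch below. The wrapper fetch_by_symbol and the gene symbols 
are hypothetical, and I am assuming that "hgnc_symbol" is the filter 
matching the symbols in your file; please check listFilters(mart) for 
the names your mart actually offers.

```r
# Sketch only: assumes the biomaRt package is installed and that the
# Ensembl BioMart service is reachable when the function is called.
# fetch_by_symbol is a hypothetical helper, not part of biomaRt.
fetch_by_symbol <- function(symbols, mart,
                            attributes = c("ensembl_gene_id", "hgnc_symbol")) {
  stopifnot(is.character(symbols), length(symbols) > 0)
  biomaRt::getBM(attributes = attributes,
                 filters    = "hgnc_symbol",  # name the filter ...
                 values     = symbols,        # ... that the values belong to
                 mart       = mart)
}

# Usage (needs network access, so it is left commented out here;
# the two symbols are illustrative placeholders):
# mart <- biomaRt::useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
# fetch_by_symbol(c("BRCA1", "TP53"), mart)
```

Passing the symbols as a plain character vector also keeps the example 
reproducible, since it no longer depends on a local file.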

As for why the attribute "hsapiens_dn" is NA for all genes in the 
dataset, that is a question I need to pass to someone more familiar 
with this particular dataset - I will forward it to the Ensembl helpdesk.

Here's a code example:

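library("biomaRt")

# connect to the Ensembl human gene dataset
mart = useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")

filters = listFilters(mart)
attrs = listAttributes(mart)

# is "hsapiens_dn" available as a filter, and as an attribute?
print("hsapiens_dn" %in% filters$name)
print("hsapiens_dn" %in% attrs$name)

# no 'filters' given, so this queries every gene in the dataset
res = getBM(attributes = c("ensembl_gene_id", "hsapiens_dn"),
            mart = mart)

sessionInfo()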


and its output

[1] TRUE


R version 2.12.0 Under development (unstable) (2010-08-02 r52661)
Platform: x86_64-apple-darwin10.4.0 (64-bit)

locale:
[1] C/C/C/C/C/it_IT

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] biomaRt_2.5.1  fortunes_1.3-7

loaded via a namespace (and not attached):
[1] RCurl_1.4-3 XML_3.1-0
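
To confirm that a column really is NA throughout, rather than only in 
the first rows shown by head(), something like the following could be 
used. This is a minimal sketch: na_fraction is a hypothetical helper, 
and the toy data frame only stands in for a real getBM result.

```r
# Hypothetical helper: fraction of missing values in one column
# of a data frame (1 means every value is NA, 0 means none are).
na_fraction <- function(df, column) {
  stopifnot(is.data.frame(df), column %in% names(df))
  mean(is.na(df[[column]]))
}

# Toy data standing in for a getBM result; the two IDs are taken
# from the output quoted in the original question.
toy <- data.frame(ensembl_gene_id = c("ENSG00000215781", "ENSG00000243259"),
                  hsapiens_dn     = c(NA_real_, NA_real_))

na_fraction(toy, "hsapiens_dn")
```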

	Best wishes

Wolfgang Huber

On Jul/27/10 3:51 PM, anupam sinha wrote:
> Hi all,
>             I am trying to download dN/dS values for human genes from
> ensemble using biomaRt. I have used codes from the website :
> http://www.r-bloggers.com/biomart-and-biomart/
>> library("biomaRt")
>> mart<- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
>> genes<- read.csv("file.txt") (this file contains hgnc gene symbols for
> Homo sapiens)
>> results<- getBM(attributes = c("ensembl_gene_id","hsapiens_dn"),values =
> genes$hsapiens_dn, mart = mart)
> But all the values of hsapiens_dn are shown to be "NA". The first few lines
> of the output
>> head(results,10)
>     ensembl_gene_id hsapiens_dn
> 1  ENSG00000215781          NA
> 2  ENSG00000243259          NA
> 3  ENSG00000225566          NA
> 4  ENSG00000189096          NA
> 5  ENSG00000215750          NA
> 6  ENSG00000212884          NA
> 7  ENSG00000212886          NA
> 8  ENSG00000229617          NA
> 9  ENSG00000241176          NA
> 10 ENSG00000215705          NA
> Can anyone please tell me where am I going wrong ? . Thanks in advance for
> any suggestions
