[BioC] BiomaRt Ensembl RefSeq query error

Georg Otto georg.otto at imm.ox.ac.uk
Tue Jan 21 13:09:52 CET 2014


Dear Bioconductors,

I am trying to query 14005 Ensembl gene IDs for their RefSeq annotations
using this code (I can send the gene IDs upon request):

library(biomaRt)

ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")

# ensembl.ids: character vector holding the 14005 Ensembl gene IDs
getBM(attributes = c("ensembl_gene_id", "refseq_mrna"),
      filters = "ensembl_gene_id",
      values = ensembl.ids,
      mart = ensembl, uniqueRows = TRUE)


If I query for the full gene set, many RefSeq IDs are missing (NA), for
example for the gene ENSMUSG00000000567 (Sox9), whereas if I query for a
subset, say ensembl.ids[1:12000], all the RefSeq IDs are there. It does
not seem to matter which subset I use, but the size of the subset has to
be smaller than ca. 12000 genes.
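
Since only the size of the subset seems to matter, one way to test this
is to split the IDs into batches and run one query per batch. Below is a
minimal sketch of that idea. The helper name query_in_chunks and the
default chunk size of 5000 are my own assumptions, not part of biomaRt,
and I have not confirmed that batching avoids the missing annotations.

```r
# Hypothetical helper (not part of biomaRt): run `query_fun` once per batch
# of IDs and stack the results. `query_fun` must take a character vector of
# IDs and return a data.frame.
query_in_chunks <- function(ids, query_fun, chunk_size = 5000) {
  # Assign each ID to a batch of at most `chunk_size` elements
  batches <- split(ids, ceiling(seq_along(ids) / chunk_size))
  results <- lapply(batches, query_fun)
  # Combine the per-batch data frames and drop duplicated rows
  unique(do.call(rbind, unname(results)))
}

# Usage with biomaRt (needs network access, so it is left commented out):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# res <- query_in_chunks(ensembl.ids, function(x)
#   getBM(attributes = c("ensembl_gene_id", "refseq_mrna"),
#         filters = "ensembl_gene_id", values = x,
#         mart = ensembl, uniqueRows = TRUE))
```

Passing the query as a function keeps the batching logic separate from
the mart connection, so the same helper can be checked offline with a
stub before running it against Ensembl.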

Any idea what is going on?

Best wishes,

Georg


