[BioC] problem getting biotype in biomaRt

Rhoda Kinsella rhoda at ebi.ac.uk
Tue Jan 13 11:41:24 CET 2009


Hi Steffen and Elizabeth,
I have had a look through the Ensembl mart configuration and have
found an error; correcting it may fix the current gene and transcript
biotype problem.
The pointer attribute for structure_biotype is still pointing to
biotype, so I will change it to point to gene_biotype, which should
solve the issue. I will implement this change for release 53
(approximately mid February). My apologies for any inconvenience, and
thank you for reporting this problem.
Regards,
Rhoda


On 13 Jan 2009, at 10:08, Rhoda Kinsella wrote:

> Hi Steffen and Elizabeth,
> The biotype attribute was changed into gene_biotype and  
> transcript_biotype after a user requested that
> we provide the transcript_biotype information. I have carried out  
> the query below on the Ensembl mart web interface and
> there are no errors reported. Steffen, can you provide me with some
> more information about where you think the source of the problem is,
> so that I can help look into this?
> Kind regards,
> Rhoda
>
>
>
>
> On 12 Jan 2009, at 20:48, steffen at stat.berkeley.edu wrote:
>
>> Hi Elizabeth,
>>
>> The biotype attribute seems to have changed into separate gene_biotype
>> and transcript_biotype attributes; these two represent the same info.
>>
>> These two new attributes, however, are indeed currently not retrievable,
>> and I am investigating what causes this. It looks like the problem is on
>> the BioMart side.
>>
>> Cheers,
>> Steffen
>>
>>
>>
>>> Hi,
>>> I am trying to pull down information from Ensembl using biomaRt and I
>>> can't get the relevant biotype information (for Human). The old
>>> 'biotype' attribute doesn't exist, so what I see is 'gene_biotype' and
>>> 'structure_biotype'. I have no idea what the difference is, but I can't
>>> get either one. The error says it's probably an internal error to be
>>> reported, but I also get this when I try to bring down what I think are
>>> incompatible attributes.
>>> Thanks,
>>> Elizabeth
>>>
>>>> library(biomaRt)
>>>> mart<-useMart("ensembl",dataset= "hsapiens_gene_ensembl")
>>> Checking attributes and filters ... ok
>>>> martAttr<-listAttributes(mart)
>>>> att<-c("ensembl_gene_id",
>>> +                 "ensembl_transcript_id",
>>> +                 "ensembl_exon_id",
>>> +                  "exon_chrom_start",
>>> +                  "exon_chrom_end",
>>> +                  "strand",
>>> +                  "chromosome_name",
>>> +                  "rank",
>>> +                  "3_utr_start","3_utr_end",
>>> +                  "5_utr_start","5_utr_end"
>>> +                  )
>>>> all(att%in%martAttr[,1]) #valid names for the mart
>>> [1] TRUE
>>> #works fine here
>>>> tempGene <- getBM(att, filter="ensembl_gene_id",
>>> +                   value="ENSG00000187634", mart=mart)
>>> #error
>>>> tempGene <- getBM(c(att, "gene_biotype"), filter="ensembl_gene_id",
>>> +                   value="ENSG00000187634", mart=mart)
>>>
>>>                                V1
>>> 1 Query ERROR: caught BioMart::Exception::Usage: Attributes from multiple attribute pages are not allowed
>>> Error in getBM(c(att, "gene_biotype"), filter = "ensembl_gene_id", value = "ENSG00000187634",  :
>>>   Number of columns in the query result doesn't equal number of
>>>   attributes in query.  This is probably an internal error, please report.
>>> #again an error
>>>> tempGene <- getBM(c(att, "structure_biotype"), filter="ensembl_gene_id",
>>> +                   value="ENSG00000187634", mart=mart)
>>>
>>>   V1
>>> 1 Query ERROR: caught BioMart::Exception::Usage: Attribute biotype NOT FOUND
>>> Error in getBM(c(att, "structure_biotype"), filter = "ensembl_gene_id",  :
>>>   Number of columns in the query result doesn't equal number of
>>>   attributes in query.  This is probably an internal error, please report.
>>>> sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> other attached packages:
>>> [1] biomaRt_1.16.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] RCurl_0.93-0 tools_2.8.1  XML_1.99-0
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>
> Rhoda Kinsella Ph.D.
> Ensembl Bioinformatician,
> European Bioinformatics Institute (EMBL-EBI),
> Wellcome Trust Genome Campus,
> Hinxton
> Cambridge CB10 1SD,
> UK.

Rhoda Kinsella Ph.D.
Ensembl Bioinformatician,
European Bioinformatics Institute (EMBL-EBI),
Wellcome Trust Genome Campus,
Hinxton
Cambridge CB10 1SD,
UK.
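
For readers who hit the "Attributes from multiple attribute pages are not
allowed" error quoted in this thread, one possible workaround is to run one
getBM() query per attribute page and join the results on ensembl_gene_id.
The R sketch below is only an illustration under assumptions: it assumes
gene_biotype sits on a different attribute page from the exon and UTR
attributes (as the error message suggests), and that it becomes retrievable
once the configuration fix described above is released. merge_by_gene() is a
hypothetical helper, not part of biomaRt, and the data frames contain made-up
placeholder rows so that the example runs without a network connection.

```r
# Sketch only: join the results of two separate getBM() queries (one per
# attribute page) on the shared gene identifier column.
# merge_by_gene() is a hypothetical helper, not a biomaRt function.
merge_by_gene <- function(structure_df, feature_df, key = "ensembl_gene_id") {
  # Both result tables must carry the key column, so it has to be requested
  # as an attribute in each query.
  stopifnot(key %in% names(structure_df), key %in% names(feature_df))
  # Left join: keep every structure row, attach the gene-level annotation.
  merge(structure_df, feature_df, by = key, all.x = TRUE)
}

# Against a live mart the two inputs might be fetched like this (needs network
# access, so it is left commented out; 'att' is the vector from the thread):
# library(biomaRt)
# mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# structure_df <- getBM(attributes = att, filters = "ensembl_gene_id",
#                       values = "ENSG00000187634", mart = mart)
# feature_df <- getBM(attributes = c("ensembl_gene_id", "gene_biotype"),
#                     filters = "ensembl_gene_id",
#                     values = "ENSG00000187634", mart = mart)

# Offline illustration with placeholder rows (NOT real Ensembl data).
structure_df <- data.frame(ensembl_gene_id = c("G1", "G1"),
                           rank = 1:2,
                           stringsAsFactors = FALSE)
feature_df <- data.frame(ensembl_gene_id = "G1",
                         gene_biotype = "example_biotype",
                         stringsAsFactors = FALSE)
tempGene <- merge_by_gene(structure_df, feature_df)
print(tempGene)
```

Because the join is a left join on the structure table, every exon-level row
is kept and the gene-level annotation is repeated across the rows of the same
gene.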


