[BioC] biomaRt: using a list as values. confused...

Steffen Durinck durinck.steffen at gene.com
Thu Jun 9 18:40:41 CEST 2011


Hi Jose,

I'll make biomaRt throw an error when someone tries the query you attempted.

Cheers,
Steffen

On Thu, Jun 9, 2011 at 9:15 AM,  <J.delasHeras at ed.ac.uk> wrote:
>
> Hi Steffen,
>
> many thanks for that. I was looking at the previous results that I said were
> ok and realised the ranges were wrong and that confused me even more!
>
> Thanks for the tip about the chromosomal region, that's just what I needed!
>
> Jose
>
>
> Quoting Steffen Durinck <durinck.steffen at gene.com> on Thu, 9 Jun 2011
> 09:12:59 -0700:
>
>> Hi Jose,
>>
>> the combo filter chr + start + end is a special case and is
>> interpreted as "give me everything in between". It is probably not well
>> documented, but this filter combo works only for a single
>> region at a time, so your second example is correct: there are only a few
>> genes in your region on chr1.
>>
>> An alternative which does work for multiple regions is to use the
>> chromosomal_region filter like:
>>
>> regions<-c("1:11401198:11694590", "2:86460656:86663869")
>> attributes<-c("hgnc_symbol", "entrezgene",
>> "chromosome_name","start_position", "end_position", "strand", "band")
>>
>> getBM(attributes=attributes,filters="chromosomal_region",values=regions,mart=ensembl)
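
[A minimal sketch, not from the original messages: the region strings for
the chromosomal_region filter can be built from the chrom, chr.start and
chr.stop vectors defined in the original question further down.
make_regions is a hypothetical helper, not a biomaRt function. It assumes
whole-number coordinates. The getBM call is left commented out because it
needs a live connection and the ensembl mart object from the question.]

```r
# Hypothetical helper (not part of biomaRt): build "chr:start:end" strings
# in the form used for the chromosomal_region filter in this thread.
make_regions <- function(chrom, start, end) {
  # All three vectors must describe the same number of regions.
  stopifnot(length(chrom) == length(start), length(start) == length(end))
  # sprintf("%.0f") writes whole numbers in plain digits, so a coordinate
  # such as 100000 is not rendered in scientific notation.
  paste(chrom, sprintf("%.0f", start), sprintf("%.0f", end), sep = ":")
}

# Values taken from the original question.
chrom     <- c("1", "2")
chr.start <- c(11401198, 86460656)
chr.stop  <- c(11694590, 86663869)

regions <- make_regions(chrom, chr.start, chr.stop)

# Query as in the message above (requires the 'ensembl' mart and the
# 'attributes' vector from the thread, plus network access):
# getBM(attributes = attributes, filters = "chromosomal_region",
#       values = regions, mart = ensembl)
```

[sprintf is used here instead of paste alone because paste can convert a
numeric coordinate such as 100000 to "1e+05", which would not be a usable
region string.]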
>>
>> Cheers,
>> Steffen
>>
>> On Thu, Jun 9, 2011 at 8:49 AM,  <J.delasHeras at ed.ac.uk> wrote:
>>>
>>> I'm trying to obtain information about genes within a number of regions
>>> defined by a chromosome name, start and end coordinates.
>>>
>>> I understand that the way to specify multiple filters to be used together
>>> (a set of chr+start+end) is to use a list for 'values'.
>>>
>>> This seems to work ok when I have more than one region (I tested it using
>>> two regions first, before doing the proper search for >1000), but if I
>>> were to specify just one region, it does not work... and I'm wondering
>>> how I would do it in that case.
>>>
>>> Example:
>>>
>>> library("biomaRt")
>>> ensembl = useMart("ENSEMBL_MART_ENSEMBL",
>>>   dataset="hsapiens_gene_ensembl",
>>>   host="www.ensembl.org")
>>>
>>> chrom<-c("1", "2")
>>> chr.start<-c(11401198, 86460656)
>>> chr.stop<-c(11694590, 86663869)
>>>
>>> attributes<-c("hgnc_symbol", "entrezgene", "chromosome_name",
>>> "start_position", "end_position", "strand", "band")
>>>
>>>
>>> # extract both regions at once:
>>> getBM(attributes=attributes,
>>>      filters=c("chromosome_name","start","end"),
>>>      values=list(chrom,chr.start,chr.stop),mart=ensembl)
>>> # this works, returning 1939 rows of data, the first 1198 with chr1
>>> # corresponding to the first region, and the rest with chr2 to the second.
>>> # Good.
>>>
>>> #but how does one retrieve the data for just ONE region?
>>> # try this:
>>> getBM(attributes=attributes,
>>>      filters=c("chromosome_name","start","end"),
>>>      values=list(chrom[1],chr.start[1],chr.stop[1]),mart=ensembl)
>>> # it only returns one gene!!! (in two rows)
>>>
>>> So, when I just want to do a single search with multiple filters, how
>>> would I specify the values?
>>>
>>> Jose
>>>
>>> --
>>> Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
>>> The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
>>> Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
>>> Swann Building, Mayfield Road
>>> University of Edinburgh
>>> Edinburgh EH9 3JR
>>> UK
>>>
>>>
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>
>>
>


