[BioC] biomaRt: using a list as values. confused...

J.delasHeras at ed.ac.uk
Thu Jun 9 19:39:29 CEST 2011


That's probably a good idea. Most people would realise the result is
not the expected one, but it will be better to get an error and be
safe.

thank you!

Jose


Quoting Steffen Durinck <durinck.steffen at gene.com> on Thu, 9 Jun 2011  
09:40:41 -0700:

> Hi Jose,
>
> I'll make biomaRt throw an error when someone tries the query you attempted.
>
> Cheers,
> Steffen
>
> On Thu, Jun 9, 2011 at 9:15 AM,  <J.delasHeras at ed.ac.uk> wrote:
>>
>> Hi Steffen,
>>
>> many thanks for that. I was looking at the previous results that I said were
>> ok and realised the ranges were wrong and that confused me even more!
>>
>> Thanks for the tip about the chromosomal region, that's just what I needed!
>>
>> Jose
>>
>>
>> Quoting Steffen Durinck <durinck.steffen at gene.com> on Thu, 9 Jun 2011
>> 09:12:59 -0700:
>>
>>> Hi Jose,
>>>
>>> the filter combination chr + start + end is a special case and is
>>> interpreted as "give me everything in between". It is probably not well
>>> documented, but this filter combination works only for a single
>>> region at a time, so your second example is correct: there are only a few
>>> genes in your region on chr1.
>>>
>>> An alternative which does work for multiple regions is to use the
>>> chromosomal_region filter like:
>>>
>>> regions<-c("1:11401198:11694590", "2:86460656:86663869")
>>> attributes<-c("hgnc_symbol", "entrezgene",
>>> "chromosome_name","start_position", "end_position", "strand", "band")
>>>
>>> getBM(attributes=attributes,filters="chromosomal_region",values=regions,mart=ensembl)
>>>
>>> Cheers,
>>> Steffen
>>>
>>> On Thu, Jun 9, 2011 at 8:49 AM,  <J.delasHeras at ed.ac.uk> wrote:
>>>>
>>>> I'm trying to obtain information about genes within a number of regions
>>>> defined by a chromosome name, start and end coordinates.
>>>>
>>>> I understand that the way to specify multiple filters to be used together
>>>> (a set of chr+start+end) is to use a list for 'values'.
>>>>
>>>> This seems to work ok when I have more than one region (I tested it using
>>>> two regions first, before doing the proper search for >1000), but if I
>>>> were to specify just one region, it does not work... and I'm wondering
>>>> how I would do it in that case.
>>>>
>>>> Example:
>>>>
>>>> library("biomaRt")
>>>> ensembl = useMart("ENSEMBL_MART_ENSEMBL",
>>>>   dataset="hsapiens_gene_ensembl",
>>>>   host="www.ensembl.org")
>>>>
>>>> chrom<-c("1", "2")
>>>> chr.start<-c(11401198, 86460656)
>>>> chr.stop<-c(11694590, 86663869)
>>>>
>>>> attributes<-c("hgnc_symbol", "entrezgene", "chromosome_name",
>>>> "start_position", "end_position", "strand", "band")
>>>>
>>>>
>>>> # extract both regions at once:
>>>> getBM(attributes=attributes,
>>>>      filters=c("chromosome_name","start","end"),
>>>>      values=list(chrom,chr.start,chr.stop),mart=ensembl)
>>>> # this works, returning 1939 rows of data, the first 1198 with chr1
>>>> # corresponding to the first region, and the rest with chr2 to the second.
>>>> # Good.
>>>>
>>>> #but how does one retrieve the data for just ONE region?
>>>> # try this:
>>>> getBM(attributes=attributes,
>>>>      filters=c("chromosome_name","start","end"),
>>>>      values=list(chrom[1],chr.start[1],chr.stop[1]),mart=ensembl)
>>>> # it only returns one gene!!! (in two rows)
>>>>
>>>> So, when I just want to do a single search with multiple filters, how
>>>> would I specify the values?
>>>>
>>>> Jose
>>>>
>>>> --
>>>> Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
>>>> The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
>>>> Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
>>>> Swann Building, Mayfield Road
>>>> University of Edinburgh
>>>> Edinburgh EH9 3JR
>>>> UK
>>>>
>>>>
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
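
A minimal sketch of the chromosomal_region approach quoted above, starting
from the separate chrom / chr.start / chr.stop vectors of the original
example. The helper name make_regions is hypothetical (it is not part of
biomaRt) and it assumes integer coordinates. The getBM() call is left
commented out because it needs a live mart connection.

```r
# Hypothetical helper, not part of biomaRt: turn parallel vectors of
# chromosome names, starts and ends into "chr:start:end" strings.
make_regions <- function(chrom, start, end) {
  stopifnot(length(chrom) == length(start), length(start) == length(end))
  # as.integer() with %d avoids scientific notation such as "1e+05"
  sprintf("%s:%d:%d", chrom, as.integer(start), as.integer(end))
}

chrom     <- c("1", "2")
chr.start <- c(11401198, 86460656)
chr.stop  <- c(11694590, 86663869)

# Several regions at once, or just one: the same call covers both cases.
regions <- make_regions(chrom, chr.start, chr.stop)
single  <- make_regions(chrom[1], chr.start[1], chr.stop[1])

# With a mart object such as 'ensembl' and the 'attributes' vector
# from the original example:
# getBM(attributes = attributes, filters = "chromosomal_region",
#       values = regions, mart = ensembl)
```

Since the region is passed as a single string per range, one region and
many regions go through the same filter, so the single-region case should
not need special handling.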



-- 
Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.


