[BioC] biomaRt: using a list as values. confused...

Steffen Durinck durinck.steffen at gene.com
Thu Jun 9 18:12:59 CEST 2011


Hi Jose,

the filter combination chromosome_name + start + end is a special case: it
is interpreted as "give me everything in between". It is probably not well
documented, but this filter combination only works for a single region at
a time. So your second example is correct: there are only a few genes in
your region on chr1.

An alternative which does work for multiple regions is to use the
chromosomal_region filter like:

regions<-c("1:11401198:11694590", "2:86460656:86663869")
attributes<-c("hgnc_symbol", "entrezgene",
"chromosome_name","start_position", "end_position", "strand", "band")
getBM(attributes=attributes,filters="chromosomal_region",values=regions,mart=ensembl)
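
If the regions are already held in separate vectors, as in the example quoted below, the region strings can be built with paste() rather than typed by hand. This is a minimal sketch, assuming the filter expects "chromosome:start:end" strings as in the call above. The helper name fmt is only illustrative; it is there so that large coordinates are not written in scientific notation.

```r
# Vectors from the original question
chrom     <- c("1", "2")
chr.start <- c(11401198, 86460656)
chr.stop  <- c(11694590, 86663869)

# Illustrative helper: plain-digit formatting, so that a coordinate such as
# 100000 is written as "100000" and not "1e+05"
fmt <- function(x) format(x, scientific = FALSE, trim = TRUE)

# One "chromosome:start:end" string per region
regions <- paste(chrom, fmt(chr.start), fmt(chr.stop), sep = ":")
regions
```

The resulting regions vector can then be passed as values= to the getBM() call above.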

Cheers,
Steffen

On Thu, Jun 9, 2011 at 8:49 AM,  <J.delasHeras at ed.ac.uk> wrote:
>
> I'm trying to obtain information about genes within a number of regions
> defined by a chromosome name, start and end coordinates.
>
> I understand that the way to specify multiple filters to be used together (a
> set of chr+start+end) is to use a list for 'values'.
>
> This seems to work ok when I have more than one region (I tested it using
> two regions first, before doing the proper search for >1000), but if I were
> to specify just one region, it does not work... and I'm wondering how I
> would do it in that case.
>
> Example:
>
> library("biomaRt")
> ensembl = useMart("ENSEMBL_MART_ENSEMBL",
>   dataset="hsapiens_gene_ensembl",
>   host="www.ensembl.org")
>
> chrom<-c("1", "2")
> chr.start<-c(11401198, 86460656)
> chr.stop<-c(11694590, 86663869)
>
> attributes<-c("hgnc_symbol", "entrezgene", "chromosome_name",
> "start_position", "end_position", "strand", "band")
>
>
> # extract both regions at once:
> getBM(attributes=attributes,
>      filters=c("chromosome_name","start","end"),
>      values=list(chrom,chr.start,chr.stop),mart=ensembl)
> # this works, returning 1939 rows of data, the first 1198 with chr1
> # corresponding to the first region, and the rest with chr2 to the second.
> # Good.
>
> #but how does one retrieve the data for just ONE region?
> # try this:
> getBM(attributes=attributes,
>      filters=c("chromosome_name","start","end"),
>      values=list(chrom[1],chr.start[1],chr.stop[1]),mart=ensembl)
> # it only returns one gene!!! (in two rows)
>
> so, when I just want to do a single search with multiple filters, how would
> I specify the values?
>
> Jose
>
> --
> Dr. Jose I. de las Heras                      Email: J.delasHeras at ed.ac.uk
> The Wellcome Trust Centre for Cell Biology    Phone: +44 (0)131 6507090
> Institute for Cell & Molecular Biology        Fax:   +44 (0)131 6507360
> Swann Building, Mayfield Road
> University of Edinburgh
> Edinburgh EH9 3JR
> UK
>