[BioC] Turning a GRanges Metadata Column into Rle List

Paul Leo p.leo at uq.edu.au
Tue Jan 15 07:38:33 CET 2013


If you want the maximum of the overlap you can use something  like this
too:

ir<-IRanges(start=you.intervals[,"start"],end=you.intervals[,"end"])
ir.encode.data<-IRanges(start=encode.data[,"start"],end=encode.data[,"end"])

over<-findOverlaps(ir,ir.encode.data)
query<-queryHits(over)
subject<-subjectHits(over)
    

tapply(as.numeric(encode.data[subject,"score"]),query,function(x)
max(x,na.rm=TRUE))

slight modification if have different chrs ....

This works  for getting GERP scores in indels etc 


Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level  ,  37 Kent Street,  Woolloongabba QLD 4102
Tel: +61 7 3443 7072  Mob: 041 303 8691  Fax: +61 7 3443 6966 

-----Original Message-----
From: Martin Morgan <mtmorgan at fhcrc.org>
To: D.Strbenac at garvan.org.au
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Turning a GRanges Metadata Column into Rle List
Date: Mon, 14 Jan 2013 22:17:17 -0800

On 01/14/2013 10:00 PM, Dario Strbenac wrote:
>> I was thinking you could just do (assuming score is always >= 0, and that
>> the ranges do not overlap, which seems to be the case from the initial
>> code):
>
> In another scenario, what if I have data on multiple cell lines, such as one
> of the ENCODE integrated tracks, and I would like to find the maximum value
> at each base within a specified genomic region ? In this case, the ranges in
> the supertrack would overlap often.
>
> I would imagine making a separate coverage RleList for each cell line, to
> avoid complications with overlapping ranges. Then, it would be nice to have a
> pmax function in the API that could take a number of RleList objects, each of
> the same length, and return one RleList. Are there any plans for pmin and
> pmax of this style ?

I think there is one?

   library(IRanges)
   showMethods(class="RleList", where=search())

 > r1 = RleList(a=Rle(c(0, 0, 1, 0)), b=Rle(c(1, 1, 0, 0)))
 > r2 = RleList(a=Rle(c(0, 0, 2, 2)), b=Rle(c(1, 2, 1, 1)))
 > pmax(r1, r2)
SimpleRleList of length 2
$a
numeric-Rle of length 4 with 2 runs
   Lengths: 2 2
   Values : 0 2

$b
numeric-Rle of length 4 with 3 runs
   Lengths: 1 1 2
   Values : 1 2 1


>
> _______________________________________________ Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>



More information about the Bioconductor mailing list