[BioC] Turning a GRanges Metadata Column into Rle List

Paul Leo p.leo at uq.edu.au
Tue Jan 15 08:59:18 CET 2013


Actually this may be better. than my last post:

But making the Rle object of scores using "coverage and weights" (is the
only way I could think of). 

In won't work in cases where the intervals in  ir.encode.data are not
unique (that is the coverage values must always be one else you get
coverage*score instead of just score when you use the weights)

Is there a better way to make an Rle of non-contigious scores ???

### assume encode.data is table with columns start,end,score
### assume you.intervals is table with columns start,end


# ir<-IRanges(start=you.intervals[,"start"],end=you.intervals[,"end"])
ir.encode.data<-IRanges(start=encode.data[,"start"],end=encode.data[,"end"])


cov<-coverage(ir.encode.data,weight=as.numeric(ir.encode.data[,"score"])) ## DANGER HERE

myViews<-Views(cov,start=you.intervals[,"start"],end=you.intervals[,"end"] )
viewMaxs(myViews)

if ir.encode.data and ir are RleLists use 
regionViews <- RleViewsList(rleList = cov, rangesList =ir ) 
instesd of the Views

I'm not sure which solution is best but the above should use less
memory?


Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level  ,  37 Kent Street,  Woolloongabba QLD 4102
Tel: +61 7 3443 7072  Mob: 041 303 8691  Fax: +61 7 3443 6966 

-----Original Message-----
From: Paul Leo <p.leo at uq.edu.au>
Reply-to: <p.leo at uq.edu.au>
To: Martin Morgan <mtmorgan at fhcrc.org>
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Turning a GRanges Metadata Column into Rle List
Date: Tue, 15 Jan 2013 16:38:33 +1000

If you want the maximum of the overlap you can use something  like this
too:

ir<-IRanges(start=you.intervals[,"start"],end=you.intervals[,"end"])
ir.encode.data<-IRanges(start=encode.data[,"start"],end=encode.data[,"end"])

over<-findOverlaps(ir,ir.encode.data)
query<-queryHits(over)
subject<-subjectHits(over)
    

tapply(as.numeric(encode.data[subject,"score"]),query,function(x)
max(x,na.rm=TRUE))

slight modification if have different chrs ....

This works  for getting GERP scores in indels etc 


Dr Paul Leo
Senior Bioinformatician
The University of Queensland Diamantina Institute
---------------------------------------------------------------------
TRI, level  ,  37 Kent Street,  Woolloongabba QLD 4102
Tel: +61 7 3443 7072  Mob: 041 303 8691  Fax: +61 7 3443 6966 

-----Original Message-----
From: Martin Morgan <mtmorgan at fhcrc.org>
To: D.Strbenac at garvan.org.au
Cc: bioconductor at r-project.org
Subject: Re: [BioC] Turning a GRanges Metadata Column into Rle List
Date: Mon, 14 Jan 2013 22:17:17 -0800

On 01/14/2013 10:00 PM, Dario Strbenac wrote:
>> I was thinking you could just do (assuming score is always >= 0, and that
>> the ranges do not overlap, which seems to be the case from the initial
>> code):
>
> In another scenario, what if I have data on multiple cell lines, such as one
> of the ENCODE integrated tracks, and I would like to find the maximum value
> at each base within a specified genomic region ? In this case, the ranges in
> the supertrack would overlap often.
>
> I would imagine making a separate coverage RleList for each cell line, to
> avoid complications with overlapping ranges. Then, it would be nice to have a
> pmax function in the API that could take a number of RleList objects, each of
> the same length, and return one RleList. Are there any plans for pmin and
> pmax of this style ?

I think there is one?

   library(IRanges)
   showMethods(class="RleList", where=search())

 > r1 = RleList(a=Rle(c(0, 0, 1, 0)), b=Rle(c(1, 1, 0, 0)))
 > r2 = RleList(a=Rle(c(0, 0, 2, 2)), b=Rle(c(1, 2, 1, 1)))
 > pmax(r1, r2)
SimpleRleList of length 2
$a
numeric-Rle of length 4 with 2 runs
   Lengths: 2 2
   Values : 0 2

$b
numeric-Rle of length 4 with 3 runs
   Lengths: 1 1 2
   Values : 1 2 1


>
> _______________________________________________ Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>

_______________________________________________
Bioconductor mailing list
Bioconductor at r-project.org
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor



More information about the Bioconductor mailing list