[BioC] Summing Views on coverage by base

Hervé Pagès hpages at fhcrc.org
Tue Mar 20 23:36:19 CET 2012


On 03/20/2012 03:32 PM, Martin Morgan wrote:
> On 03/20/2012 03:10 PM, Sean Davis wrote:
>> 2012/3/20 Martin Morgan<mtmorgan at fhcrc.org>:
>>> On 03/20/2012 01:40 PM, Hervé Pagès wrote:
>>>> Hi Sean,
>>>>
>>>> On 03/20/2012 01:14 PM, Sean Davis wrote:
>>>>> I have a set of Views of equal width (think upstream of tss) and want
>>>>> to sum each base across those views. I can extract each view as an
>>>>> integer vector and create a matrix, but this matrix can get pretty
>>>>> large. I'm missing the skills with SimpleRleViewsList, though, to
>>>>> work directly on at object. Any suggestions?
>>>>
>>>> > subject<- Rle(rep(c(0L, 1L, 3L, 2L, 18L, 0L), c(3,2,1,5,2,4)))
>>>> > myViews<- Views(subject, start=4:11, width=5)
>>>> > myViews
>>>> Views on a 17-length Rle subject
>>>>
>>>> views:
>>>> start end width
>>>> [1] 4 8 5 [1 1 3 2 2]
>>>> [2] 5 9 5 [1 3 2 2 2]
>>>> [3] 6 10 5 [3 2 2 2 2]
>>>> [4] 7 11 5 [2 2 2 2 2]
>>>> [5] 8 12 5 [ 2 2 2 2 18]
>>>> [6] 9 13 5 [ 2 2 2 18 18]
>>>> [7] 10 14 5 [ 2 2 18 18 0]
>>>> [8] 11 15 5 [ 2 18 18 0 0]
>>>>
>>>> This maybe would be fast enough if you don't have too many columns:
>>>>
>>>> viewColSums<- function(x)
>>>> {
>>>> sapply(seq_len(width(x)[1L]),
>>>> function(i)
>>>> sum(subject[start(x)+i-1L]))
>>>> }
>>>>
>>>> > viewColSums(myViews)
>>>> [1] 15 32 49 46 44
>>>
>>>
>>> Reduce("+", myViews)
>>
>> Nice.
>
> maybe too many class coercions in there, though
>
> Reduce(function(x, y) x + as.integer(y), myViews, 0L)
>
> which seems like a step or so back...

FWIW, the main difference between using viewColSums() and Reduce()
is that the former loops on the columns while the latter loops on the
rows. So what's best for you depends on the geometry of your Views
object.

Cheers,
H.


>
> Martin
>
>>
>> Sean
>>
>>
>>>>
>>>> Then if your SimpleRleViewsList object is not too long (1 elt per
>>>> chromosome?), you can sapply( , viewColSums) on it.
>>>>
>>>> Maybe we should make viewColSums the "colSums" method for RleViews
>>>> objects? (and eventually implement it in C?)
>>>>
>>>> Cheers,
>>>> H.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Sean
>>>>>
>>>>>> sessionInfo()
>>>>> R Under development (unstable) (2012-01-19 r58141)
>>>>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>>>>
>>>>> locale:
>>>>> [1] C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats graphics grDevices utils datasets methods base
>>>>>
>>>>> other attached packages:
>>>>> [1] GenomicRanges_1.7.30 IRanges_1.13.28 BiocGenerics_0.1.12
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] stats4_2.15.0 tools_2.15.0
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>>
>>>
>>>
>>> --
>>> Computational Biology
>>> Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>>>
>>> Location: M1-B861
>>> Telephone: 206 667-2793
>
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319



More information about the Bioconductor mailing list