[BioC] read counts in sliding windows

Hervé Pagès hpages at fhcrc.org
Fri Aug 12 00:54:11 CEST 2011


Hi Jason,

On 11-08-02 07:46 PM, Jason Lu wrote:
> Hi list,
>
> I would like to seek your advice with this.
> Here I have a GRangesList and a GappedAlignments objects (an alignment
> to the genome). For each GRanges (with length 1) in the GRangesList,
                            ^^^^^^^^^^^^^^^^^^^^^^^
Are you saying each GRanges in your GRangesList object contains only
1 range? Wouldn't it then make sense to unlist your GRangesList object
and work instead with 1 big GRanges object? It's just a little bit
easier to work with a GRanges object than with a GRangesList object.

> I would like to do a sliding window on its ranges and count reads from
> each window. So far I wasn't able to find an efficient way to do this.

Assuming you are now using a GRanges object (gr0) instead of a
GRangesList object, have you tried looping over the positions of the
sliding window instead of looping over the ranges in 'gr0'? This will
probably lead to a much shorter loop, as the number of times you move
the sliding window is likely to be much smaller than the number of
ranges in 'gr0'.
To be a little more explicit: for each position of the sliding
window, generate a GRanges object 'gr' that has the same length
as 'gr0' and where each range represents the current position of
the window. Then call 'countOverlaps()' inside the loop.
If you use sapply() for the looping, you will end up with a matrix
with one row per range in 'gr0' and one column per position of the
sliding window.
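
To make the shape of that loop concrete, here is a rough sketch in plain
Python rather than R. It only illustrates the logic (loop over window
positions, count overlaps for all ranges at once, collect one column per
position) and does not use or reproduce the Bioconductor API. Everything
in it is an assumption made for illustration: the function names, the
toy coordinates, a single chromosome, closed (start, end) intervals,
strand being ignored, and all ranges being wide enough that the window
stays inside them at every position.

```python
from bisect import bisect_left, bisect_right


def count_overlaps(windows, read_starts, read_ends):
    """Count, for each closed (start, end) window, the reads overlapping it.

    read_starts and read_ends must each be sorted. A read overlaps a window
    when read_start <= window_end and read_end >= window_start, so the count
    is (#reads starting at or before window_end) minus
    (#reads ending before window_start).
    """
    counts = []
    for win_start, win_end in windows:
        n_start_le = bisect_right(read_starts, win_end)
        n_end_lt = bisect_left(read_ends, win_start)
        counts.append(n_start_le - n_end_lt)
    return counts


def sliding_window_counts(ranges, reads, width, step, n_positions):
    """Loop over window positions (not over ranges).

    Returns a list of rows: one row per range, one column per position.
    """
    read_starts = sorted(start for start, _ in reads)
    read_ends = sorted(end for _, end in reads)
    columns = []
    for k in range(n_positions):
        offset = k * step
        # One window per range, all at the same relative position.
        windows = [(start + offset, start + offset + width - 1)
                   for start, _ in ranges]
        columns.append(count_overlaps(windows, read_starts, read_ends))
    # Transpose so that rows are ranges and columns are window positions.
    return [list(row) for row in zip(*columns)]


# Hypothetical toy data: two ranges of width 50 and five reads of width 10.
ranges = [(1, 50), (101, 150)]
reads = [(5, 14), (12, 21), (30, 39), (110, 119), (140, 149)]
counts = sliding_window_counts(ranges, reads, width=20, step=10, n_positions=4)
print(counts)  # [[2, 3, 2, 1], [1, 1, 1, 1]]
```

The outer loop runs 4 times here no matter how many ranges there are,
which is the point of looping over the window positions.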

> My problem is similar to this one:
> http://permalink.gmane.org/gmane.science.biology.informatics.conductor/34431.
> But looping over the GRangesList (and using countOverlaps inside the
> loop) takes significant time.

If you show us the code you are using for doing this then we'll know
exactly what you are trying to do.

Cheers,
H.

>
> Thanks for any suggestions.
>
> Best,
>
> Jason
>


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


