[BioC] speed of runmed on SimpleRleList
Janet Young
jayoung at fhcrc.org
Fri Feb 15 21:27:05 CET 2013
Hi there,
I've been using runmean on some coverage objects - nice. Runmed also seems useful, but it is oddly slow. I'm wondering whether some easy improvement could be made there? I see the same issue with the devel version and an older version. Everything should be explained in the code below (I hope).
thanks very much,
Janet
-------------------------------------------------------------------
Dr. Janet Young
Malik lab
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung ...at... fhcrc.org
-------------------------------------------------------------------
library(GenomicRanges)
### a small GRanges object (example from ?GRanges)
seqinfo <- Seqinfo(paste0("chr", 1:3), c(1000, 2000, 1500), NA, "mock1")
gr2 <- GRanges(seqnames =
                   Rle(c("chr1", "chr2", "chr1", "chr3"), c(1, 3, 2, 4)),
               ranges = IRanges(1:10, width = 10:1, names = head(letters, 10)),
               strand = Rle(strand(c("-", "+", "*", "+", "-")), c(1, 2, 2, 3, 2)),
               score = 1:10,
               GC = seq(1, 0, length = 10),
               seqinfo = seqinfo)
gr2
cov <- coverage(gr2)
### runmed is slow! (for you Hutchies: this is on a rhino machine)
### I'm really trying to run this on some much bigger objects (whole genome coverage), where the slowness is more of an issue.
system.time(runmed(cov, 51))
# user system elapsed
# 1.120 0.016 1.518
### runmean is fast
system.time(runmean(cov, 51))
# user system elapsed
# 0.008 0.000 0.005
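One possible workaround, sketched here in base R only: expand the run-length encoding to a plain vector, call stats::runmed on it, and re-compress the result. This is a minimal sketch, not how IRanges implements runmed. The helper name runmed_rle and the toy track are hypothetical, and the sketch assumes the expanded vector fits in memory.

```r
# Hypothetical helper (not part of IRanges): running median of a base-R
# "rle" object by expanding it, smoothing, and re-compressing.
# Assumption: the expanded vector fits in memory.
runmed_rle <- function(x, k, endrule = "median") {
  stopifnot(inherits(x, "rle"), k %% 2 == 1)
  dense <- inverse.rle(x)                      # expand runs to a plain vector
  smoothed <- stats::runmed(dense, k, endrule = endrule)
  rle(as.vector(smoothed))                     # strip attributes, re-compress
}

# Toy coverage-like track (made-up numbers): flat background, a
# one-position spike, and a 200-position plateau.
track <- rle(rep(c(0, 50, 0, 5, 0), times = c(500, 1, 499, 200, 800)))

smoothed <- runmed_rle(track, 51)
print(smoothed)
```

For an IRanges RleList the analogous steps would presumably be as.vector() on each Rle, stats::runmed, then Rle() to re-compress, applied one sequence at a time. I have not timed that here, and for whole-genome coverage the expanded vectors could be large.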
sessionInfo()
R Under development (unstable) (2012-10-03 r60868)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GenomicRanges_1.11.29 IRanges_1.17.32 BiocGenerics_0.5.6
loaded via a namespace (and not attached):
[1] stats4_2.16.0
## see a similar issue on my Mac, using older R
R version 2.15.1 (2012-06-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.10.1 IRanges_1.16.2 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] parallel_2.15.1 stats4_2.15.1