[BioC] fastest way to keep score when reduce Granges data

Hervé Pagès hpages at fhcrc.org
Wed Feb 26 02:21:32 CET 2014


Hi Jianhong,

It would help enormously if you could send code that we can actually
run. Thanks!

H.


On 02/24/2014 07:53 AM, Ou, Jianhong wrote:
> Hi ALL,
>
> I want to reduce a GRanges object by a fixed window size and keep the scores after reducing. My code is:
>
> library(GenomicRanges)
> .dat <- GRanges("chr1", IRanges(start=1:50, width=2), strand="+", score=sample(1:50, 50))
> windowSize <- 10
> GRwin <- GRanges("chr1", IRanges(start=(0:5)*windowSize+scale[1]-1,
>                                            width=windowSize), strand="+")
> ol <- findOverlaps(.dat, GRwin)
> ol <- as.data.frame(ol)
> ol <- ol[!duplicated(ol[,1]),]
> .dat <- split(.dat, ol[,2])
> reduceValue <- function(.datReduce){
>                  .datReduceM <- reduce(.datReduce, with.mapping=TRUE)
>                  wid <- width(.datReduce)
>                  .datReduceScore <- .datReduce$score
>                  .datReduceM$score <- sapply(.datReduceM$mapping, function(.idx){
>                      round(sum(.datReduceScore[.idx]*wid[.idx])/sum(wid[.idx]))
>                  })
>                  .datReduceM$mapping <- NULL
>                  .datReduceM
>              }
> .dat <- lapply(.dat, reduceValue)
> .dat <- unlist(GRangesList(.dat))
>
> But this is very slow. What is the fastest way to keep the scores when reducing a GRanges object by a fixed window size? Thanks for your help.
>
> Yours sincerely,
>
> Jianhong Ou
>
> LRB 670A
> Program in Gene Function and Expression
> 364 Plantation Street Worcester,
> MA 01605
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
