[R] fusion of overlapping intervals

arun smartpink111 at yahoo.com
Mon Nov 5 21:26:21 CET 2012


Hi,

Maybe you should check this link (http://r.789695.n4.nabble.com/R-overlapping-intervals-td810061.html).


dat1<-structure(list(chr = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), start = c(5, 30, 49, 70, 100, 129),
    end = c(10, 52, 101, 103, 130, 140)), .Names = c("chr", "start",
"end"), row.names = c(NA, -6L), class = "data.frame")

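Using Jim's code:
fun1<-function(x){
# columns 2 and 3 of x hold start and end; mark every integer position they cover
x1<-x2<-logical(max(x[,2],x[,3]))
x1[unlist(mapply(seq,x[,2],x[,3]))]<-TRUE
x2[unlist(mapply(seq,x[,2],x[,3]))]<-TRUE
# each run of TRUE is one merged interval
r<-rle(x1 & x2)
# offset holds the last position of every run
offset<-cumsum(r$lengths)
# first and last position of each TRUE run
cbind(offset[r$values]-r$lengths[r$values] +1,offset[r$values])}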
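
To see how the rle() and cumsum() steps recover the interval boundaries, here is a small self-contained illustration on a toy vector. The vector `covered` and the names `run_start`, `run_end` and `runs` are made up for this example and are not part of Jim's code.

```r
# toy coverage vector (made-up): positions 2-3 and 6-8 are covered
covered <- c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

r <- rle(covered)
# r$lengths is 1 2 2 3 and r$values is FALSE TRUE FALSE TRUE

offset <- cumsum(r$lengths)                      # last position of each run
run_end <- offset[r$values]                      # ends of the TRUE runs
run_start <- run_end - r$lengths[r$values] + 1   # starts of the TRUE runs

runs <- cbind(start = run_start, end = run_end)
runs
```

Each row of `runs` is one covered stretch, which is what fun1 returns for the real data.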

list1<-split(dat1,dat1$chr)
# loop over the chromosome names so that each block of merged intervals gets its label
res<-do.call(rbind,lapply(names(list1),function(nm) data.frame(chr=nm,fun1(list1[[nm]]))))
rownames(res)<-1:nrow(res)
colnames(res)<-colnames(dat1)
res
#  chr start end
#1   a     5  10
#2   a    30 101
#3   b    70 140
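
If the coordinates are large (genomic positions, say), a logical vector as long as the largest end coordinate may be wasteful. A possible alternative is to sort the intervals and compare each start with the running maximum of the previous ends. This is only a sketch under that assumption; `merge_intervals`, `pieces` and `res2` are hypothetical names and not part of Jim's code.

```r
# hypothetical helper: merge the overlapping intervals of ONE chromosome
merge_intervals <- function(start, end) {
  o <- order(start, end)
  start <- start[o]
  end <- end[o]
  # largest end seen among the intervals that come before each one
  prev_max_end <- c(-Inf, cummax(end)[-length(end)])
  # a new group begins where an interval starts after all earlier ones have ended
  grp <- cumsum(start > prev_max_end)
  data.frame(start = as.vector(tapply(start, grp, min)),
             end = as.vector(tapply(end, grp, max)))
}

# same test data as in the original question
dat1 <- data.frame(chr = c("a", "a", "a", "b", "b", "b"),
                   start = c(5, 30, 49, 70, 100, 129),
                   end = c(10, 52, 101, 103, 130, 140))

# merge within each chromosome, then stack the pieces
pieces <- split(dat1, dat1$chr, drop = TRUE)
res2 <- do.call(rbind, lapply(names(pieces), function(nm) {
  data.frame(chr = nm, merge_intervals(pieces[[nm]]$start, pieces[[nm]]$end))
}))
rownames(res2) <- NULL
res2
```

On this test data the sketch gives the same three intervals as above. One difference to keep in mind: intervals that are merely adjacent (e.g. 5-10 and 11-20) stay separate here, whereas the logical-vector approach fuses them because the covered positions are contiguous.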

A.K.





----- Original Message -----
From: Hermann Norpois <hnorpois at googlemail.com>
To: r-help at r-project.org
Cc: 
Sent: Monday, November 5, 2012 12:14 PM
Subject: [R] fusion of overlapping intervals

Hello,

I have start and end coordinates from different experiments (DNase
hypersensitivity data) and now I would like to combine overlapping
intervals. For instance (see my test data below) (2) 30-52 and (3) 49-101
are combined to 30-101. But 49-101 and 70-103 would not be combined because
they are on different chromosomes (chr a and chr b).
Does anybody have an idea?
Thanks
Hermann

> df
  chr start end
1   a     5  10
2   a    30  52
3   a    49 101
4   b    70 103
5   b   100 130
6   b   129 140
> dput (df)
structure(list(chr = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("a",
"b"), class = "factor"), start = c(5, 30, 49, 70, 100, 129),
    end = c(10, 52, 101, 103, 130, 140)), .Names = c("chr", "start",
"end"), row.names = c(NA, -6L), class = "data.frame")






