[R] Finding "runs" of TRUE in binary vector

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Jan 27 23:49:27 CET 2005


Sean Davis <sdavis2 at mail.nih.gov> writes:

> I have a binary vector and I want to find all "regions" of that vector
> that are runs of TRUE (or FALSE).
> 
>  > a <- rnorm(10)
>  > b <- a<0.5
>  > b
>   [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
> 
> My function would return something like a list:
> region[[1]] 1,3
> region[[2]] 5,5
> region[[3]] 7,10
> 
> Any ideas besides looping and setting start and ends directly?

You could base it on rle(), which run-length encodes the vector:

> rle(b)
Run Length Encoding
  lengths: int [1:5] 1 1 2 4 2
  values : logi [1:5]  TRUE FALSE  TRUE FALSE  TRUE
> b
 [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE

(Notice that my b differs from yours, since rnorm() draws new random numbers.)

then you might proceed with 

> end <- cumsum(rle(b)$lengths)
> start <- rev(length(b) + 1  - cumsum(rev(rle(b)$lengths)))
> # or:   start <- c(1, end[-length(end)] + 1)
> # drop=FALSE keeps the result a matrix even if there is only one TRUE run
> cbind(start,end)[rle(b)$values, , drop=FALSE]
     start end
[1,]     1   1
[2,]     3   4
[3,]     9  10
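
If you want the list form asked for in the question, the steps above could be wrapped up roughly as follows. This is only a sketch: find_runs() and its value argument are names made up here for illustration, not an existing R function, and it computes start as end - lengths + 1, which is equivalent to the two versions above. It is applied to the b from the original question.

```r
# Hypothetical helper (not part of base R): start/end positions of each
# run of `value` in a logical vector, as a two-column matrix.
find_runs <- function(x, value = TRUE) {
  r <- rle(x)                      # compute the run-length encoding once
  end <- cumsum(r$lengths)         # last index of every run
  start <- end - r$lengths + 1     # first index of every run
  keep <- r$values %in% value      # %in% avoids NA in the row selection
  cbind(start = start[keep], end = end[keep])
}

# The vector shown in the question
b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)
runs <- find_runs(b)

# One c(start, end) pair per region, as a list
region <- lapply(seq_len(nrow(runs)), function(i) unname(runs[i, ]))
```

Here region[[1]] is 1,3, region[[2]] is 5,5 and region[[3]] is 7,10, as in the question. find_runs(b, value = FALSE) would give the FALSE runs instead.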


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



