# [R] find jumps in vector of repeats

arun smartpink111 at yahoo.com
Sat Oct 19 06:54:57 CEST 2013

In addition to Bill's method, you may also use:

vec1 <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
c(0,which(diff(vec1)!=0))
#or
indx <- cumsum(rle(vec1)$lengths)
c(0,indx[-length(indx)])
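
For reference, here is a small self-contained sketch of the two approaches above, run on the example vector from the original question. The names jumps_diff, run_ends and jumps_rle are only illustrative and are not from the thread.

```r
# Example vector from the original question: runs of length 10, 30, 24, 65, 3
vec1 <- rep(c(1, 2, 3, 4, 5), c(10, 30, 24, 65, 3))

# Approach 1: diff() is non-zero exactly at the last element of each run
# except the final one; prepend 0 to match the requested output
jumps_diff <- c(0, which(diff(vec1) != 0))

# Approach 2: cumulative run lengths give the end index of every run;
# drop the last one (the end of the vector) and prepend 0
run_ends <- cumsum(rle(vec1)$lengths)
jumps_rle <- c(0, run_ends[-length(run_ends)])

print(jumps_diff)   # 0 10 40 64 129
print(jumps_rle)    # 0 10 40 64 129
```

Both give the 0, 10, 40, 64, 129 asked for in the question.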

#Timings on a longer vector: Bill's method (isLastInRun, defined in his message below) was found to be the fastest

vec3 <- rep(vec1,1e4)
system.time( res <- c(0,which(diff(vec3)!=0)))
#   user  system elapsed
# 0.124   0.000   0.125
system.time({ indx <- cumsum(rle(vec3)$lengths)
res2 <- c(0,indx[-length(indx)])})
#   user  system elapsed
#   0.112   0.000   0.112

system.time({ indx <- which(isLastInRun(vec3))
res3 <- c(0,indx[-length(indx)]) })
#   user  system elapsed
#  0.088   0.000   0.086
system.time({indx <- cumsum(c(0,abs(diff(vec3))))
indx2 <- tapply(seq_along(indx),list(indx),FUN=max)
res4 <- c(0,indx2[-length(indx2)]) })
#   user  system elapsed
#  2.456   0.000   2.457
names(res4)<-NULL
identical(res,res4)
#[1] TRUE
identical(res,res2)
#[1] TRUE
identical(res,res3)
#[1] TRUE

A.K.

On Friday, October 18, 2013 8:31 PM, William Dunlap <wdunlap at tibco.com> wrote:
> I have a very long vector (length=1855190) it looks something like this
>
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
>
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129

Define 2 functions:
isFirstInRun <- function(x) c(TRUE, x[-1]!=x[-length(x)])
isLastInRun <- function(x) c(x[-1]!=x[-length(x)], TRUE)
and use them as
> z <- rep(c(1,2,3,4,5), c(10,30,24,65,3))
> which(isLastInRun(z))
[1]  10  40  64 129 132
> which(isFirstInRun(z))
[1]   1  11  41  65 130
(0 is not a valid R index into a vector, so I prefer one of
the above results, but you can fiddle with the endpoints
as you wish.)
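
One possible way to "fiddle with the endpoints" is sketched below, assuming the 0-prefixed form from the original question is what is wanted. The names run_ends, jumps0 and run_lengths are hypothetical and only for illustration.

```r
# Bill's helper, repeated here so the sketch is self-contained
isLastInRun <- function(x) c(x[-1] != x[-length(x)], TRUE)

z <- rep(c(1, 2, 3, 4, 5), c(10, 30, 24, 65, 3))

# End index of every run, including the final one
run_ends <- which(isLastInRun(z))              # 10 40 64 129 132

# Drop the final end point and prepend 0 to get the form in the question
jumps0 <- c(0, run_ends[-length(run_ends)])    # 0 10 40 64 129

# The run lengths can be recovered from the end points as well
run_lengths <- diff(c(0, run_ends))            # 10 30 24 65 3
```

Note that this helper returns TRUE for a zero-length input, so an empty vector probably needs to be handled separately if it can occur.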

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Benton, Paul
> Sent: Friday, October 18, 2013 5:18 PM
> To: r-help at r-project.org
> Subject: [R] find jumps in vector of repeats
>
> Hello all,
>
> I'm not really sure how to search for this in google/Rseek so there is probably a
> command to do it. I also know I could write an apply loop to find it but thought I would
> ask all you lovely R gurus.
>
> I have a very long vector (length=1855190) it looks something like this
>
> 1111...2222...3333....etc so it would be something equivalent of doing:
> rep(c(1,2,3,4,5), c(10,30,24,65,3))
>
> How can I find the index of where the step/jump is? For example using the above I would
> get an index of 0, 10, 40, 64, 129
>
> Any help would be greatly appreciated.
>
> Cheers,
>
> Paul