[R] sequences extraction

GREGOR Brian J Brian.J.GREGOR at odot.state.or.us
Fri Apr 20 20:00:22 CEST 2007


Here's a solution which uses diff() and apply().

findSequences <- function(Data){
     # Sort the vector in case it hasn't been sorted
     Data <- sort(Data)
     # Check that there are no duplicate values in Data
     if(any(duplicated(Data))) stop("Function will not work if Data argument contains duplicate values.")
     # Make a matrix of the starting and ending indices of sequences
     Breaks <- which(diff(Data) != 1)
     Starts <- c(0, Breaks) + 1
     Ends <- c(Breaks, length(Data)) 
     SequenceIndices <- cbind(Starts, Ends)
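     # Return a list of vectors of sequences. Note that apply() simplifies
     # its result to a matrix or vector when all sequences have the same length.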
     apply(SequenceIndices, 1, function(x) Data[x[1]:x[2]])
     }
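
For illustration, here is a minimal sketch of the same start/end index idea that always returns a list, even when every sequence has the same length (a case where apply() would simplify to a matrix). The function name findSequencesList and the use of Map() are assumptions made for this example; they are not part of the function above. The first call uses the example vector from the original question.

```r
# Hypothetical variant of findSequences(); the name and the Map() call are
# assumptions for illustration only.
findSequencesList <- function(Data) {
  # Sort the vector in case it hasn't been sorted
  Data <- sort(Data)
  if (any(duplicated(Data))) stop("Data must not contain duplicate values.")
  # An empty input has no sequences
  if (length(Data) == 0) return(list())
  # Positions where the step to the next value is not exactly 1
  Breaks <- which(diff(Data) != 1)
  Starts <- c(0, Breaks) + 1
  Ends <- c(Breaks, length(Data))
  # Map() returns one list element per (start, end) pair and never simplifies
  unname(Map(function(s, e) Data[s:e], Starts, Ends))
}

# Example vector from the original question
a <- c(1, 2, 3, 6, 10, 11, 13)
str(findSequencesList(a))

# Both runs have length 2 here, so apply() would return a 2 x 2 matrix
b <- c(1, 2, 5, 6)
str(findSequencesList(b))
```

Map() is a thin wrapper around mapply(..., SIMPLIFY = FALSE), so the shape of the result should not depend on the run lengths.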

>> I need to extract sequences from an ordered vector.
>> For example, if
>> a<-c(1,2,3,6,10,11,13)
>> I need to get the following 4 vectors:
>> (1,2,3),(6),(10,11),(13)

>There should be a more elegant way to do it, but the following code
>seems to work (it returns the results as a list):


>a<-c(9,1,2,3,6,10,11,17,13,14)
>d <- diff(a)
>result <- list()
>tmp <- a[1]            
>for (i in seq_along(d)) {   # seq_along() is safe when d is empty
>        if (d[i]==1) {
>                tmp <- c(tmp, a[i+1])
>        } else {
>                result <- c(result,list(tmp))
>                tmp <- a[i+1]         
>        }
>}
>result <- c(result,list(tmp))   
>result
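
The loop above can probably also be written without explicit iteration. The following is only a sketch using split() and cumsum() from base R; it does not come from this thread, and the variable name runId is an assumption. Like the loop, it does not sort the input, and it assumes the vector has at least one element.

```r
a <- c(9, 1, 2, 3, 6, 10, 11, 17, 13, 14)
# A new run starts at the first element and wherever the step from the
# previous element is not exactly 1; cumsum() turns those flags into run ids
runId <- cumsum(c(TRUE, diff(a) != 1))
# split() groups the values by run id; unname() drops the "1", "2", ... names
result <- unname(split(a, runId))
result
```

For this vector the result should match what the loop builds: a list holding 9, then 1 2 3, then 6, then 10 11, then 17, then 13 14.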


Brian Gregor, P.E.
Transportation Planning Analysis Unit
Oregon Department of Transportation
Brian.J.GREGOR at odot.state.or.us
(503) 986-4120


