[R] counting sets of consecutive integers in a vector

Mike Miller mbmiller+l at gmail.com
Mon Jan 5 01:03:03 CET 2015

I have a vector of sorted positive integer values (e.g., positive integers 
after applying sort() and unique()).  For example, this:


I want to make a matrix from that vector that has two columns: (1) the 
first value in every run of consecutive integer values, and (2) the 
corresponding number of consecutive values.  For example:

c(1:20) would become this...

1  20

...because there are 20 consecutive integers beginning with 1, and 
c(1,2,5,6,7,8,25,30,31,32,33) would become this:

1  2
5  4
25 1
30 4

What would be the best way to accomplish this?  Here is my first effort:

v <- c(1,2,5,6,7,8,25,30,31,32,33)
L <- rle( v - 1:length(v) )$lengths
n <- length( L )
matrix( c( v[ c( 1, cumsum(L)+1 ) ][1:n], L), nrow=n)

      [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]   25    1
[4,]   30    4

I suppose that works well enough, but there may be a better way, and 
besides, I wouldn't want to deny anyone here the opportunity to solve a 
fun puzzle.  ;-)
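
One possible alternative, offered only as a sketch: mark every element that starts a new run (the first element, plus any element whose gap to its predecessor is not 1), then count the elements per run. The function name consecutive_runs and the column names are hypothetical, chosen for illustration, and the input is assumed to be sorted and free of duplicates, as described above.

```r
# Sketch: summarise runs of consecutive integers in a sorted, unique vector.
# Returns a two-column matrix: run start and run length.
consecutive_runs <- function(v) {
  # An empty input has no runs; return a 0 x 2 matrix.
  if (length(v) == 0L) {
    return(cbind(start = numeric(0), length = integer(0)))
  }
  # TRUE at the first element and wherever the step from the previous
  # value is not exactly 1.
  is_start <- c(TRUE, diff(v) != 1)
  # cumsum(is_start) labels each element with its run number;
  # tabulate() counts how many elements carry each label.
  cbind(start = v[is_start], length = tabulate(cumsum(is_start)))
}

runs <- consecutive_runs(c(1, 2, 5, 6, 7, 8, 25, 30, 31, 32, 33))
runs
```

For the example vector this yields the same four rows as the rle() version (1 2, 5 4, 25 1, 30 4). The main difference is that it reads the run starts directly from v instead of indexing through cumsum(L).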

The use for this is that I will be doing repeated seeks of a binary file 
to extract data.  seek() sets the starting point and readBin(n=X) sets 
the number of values to read.  So when there are many consecutive variables 
to be read, I can multiply the X in n=X by that number instead of doing 
many different seek() calls.  (The data are in a transposed format where I 
read in every record for some variable as sequential elements.)  I'm 
probably not the first person to deal with this.
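
As a rough illustration of how the two-column matrix could drive the reads, here is a sketch under assumptions that are not stated in the post: the file has no header, every variable occupies n_records contiguous values of a fixed byte size, and variables are numbered from 1 in file order. The function read_runs and all of its arguments are hypothetical names.

```r
# Sketch: one seek() and one readBin() per run of consecutive variables.
# `runs` is a two-column matrix: first variable of the run, run length.
read_runs <- function(path, runs, n_records, what = "double", size = 8L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  out <- vector("list", nrow(runs))
  for (i in seq_len(nrow(runs))) {
    # Byte offset of the first variable in this run (assumed layout).
    offset <- (runs[i, 1] - 1) * n_records * size
    seek(con, where = offset, origin = "start")
    # Read every record of every variable in the run in a single call.
    values <- readBin(con, what = what, n = runs[i, 2] * n_records,
                      size = size)
    # One column per variable, one row per record.
    out[[i]] <- matrix(values, nrow = n_records)
  }
  do.call(cbind, out)
}

# Small self-contained demonstration: 10 variables with 3 records each.
tmp <- tempfile()
n_records <- 3
writeBin(as.double(1:(10 * n_records)), tmp, size = 8L)
runs <- cbind(c(2, 7), c(2, 3))  # variables 2-3 and variables 7-9
x <- read_runs(tmp, runs, n_records)
unlink(tmp)
```

Here two seek() calls replace the five that a per-variable loop would need. The help page for seek() warns that it can be unreliable on some platforms, so this is worth checking on the target system.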



Michael B. Miller, Ph.D.
University of Minnesota
