[R] counting sets of consecutive integers in a vector

Mike Miller mbmiller+l at gmail.com
Mon Jan 5 01:03:03 CET 2015


I have a vector of sorted positive integer values (e.g., postive integers 
after applying sort() and unique()).  For example, this:

c(1,2,5,6,7,8,25,30,31,32,33)

I want to make a matrix from that vector that has two columns: (1) the 
first value in every run of consecutive integer values, and (2) the 
corresponding number of consecutive values.  For example:

c(1:20) would become this...

1  20

...because there are 20 consecutive integers beginning with 1 and 
c(1,2,5,6,7,8,25,30,31,32,33) would become

1  2
5  4
25 1
30 4

What would be the best way to accomplish this?  Here is my first effort:

v <- c(1,2,5,6,7,8,25,30,31,32,33)
L <- rle( v - 1:length(v) )$lengths
n <- length( L )
matrix( c( v[ c( 1, cumsum(L)+1 ) ][1:n], L), nrow=n)

      [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]   25    1
[4,]   30    4

I suppose that works well enough, but there may be a better way, and 
besides, I wouldn't want to deny anyone here the opportunity to solve a 
fun puzzle.  ;-)

The use for this is that I will be doing repeated seeks of a binary file 
to extract data.  seek() gives the starting point and readBin(n=X) gives 
the number of bytes to read.  So when there are many consecutive variables 
to be read, I can multiply the X in n=X by that number instead of doing 
many different seek() calls.  (The data are in a transposed format where I 
read in every record for some variable as sequential elements.)  I'm 
probably not the first person to deal with this.

Best,

Mike

-- 
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ



More information about the R-help mailing list