[R] counting sets of consecutive integers in a vector

Peter Alspach Peter.Alspach at plantandfood.co.nz
Mon Jan 5 02:27:06 CET 2015


Tena koe Mike

An alternative, which is slightly faster:

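  diffv <- diff(v)                        # gaps between neighbouring values
  starts <- c(1, which(diffv!=1)+1)       # positions where a new run begins
  # first value of each run, and the length of each run
  cbind(v[starts], c(diff(starts), length(v)-starts[length(starts)]+1))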
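
For reuse, the steps above could be wrapped in a function. This is only a sketch: the name run_starts and the column names are made up for illustration, and it assumes v is already sorted and unique, as in the original question. Appending length(v)+1 before taking diff() yields the last run length without a separate term, and the early return covers an empty vector.

```r
# Hypothetical helper; the name and column labels are illustrative only.
# Assumes v is sorted and contains no duplicates.
run_starts <- function(v) {
  if (length(v) == 0L) {
    return(cbind(start = numeric(0), n = integer(0)))
  }
  starts <- c(1L, which(diff(v) != 1) + 1L)  # index where each run begins
  lens <- diff(c(starts, length(v) + 1L))    # length of each run
  cbind(start = v[starts], n = lens)
}

v <- c(1,2,5,6,7,8,25,30,31,32,33)
res <- run_starts(v)
print(res)
```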

Peter Alspach

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Mike Miller
Sent: Monday, 5 January 2015 1:03 p.m.
To: R-Help List
Subject: [R] counting sets of consecutive integers in a vector

I have a vector of sorted positive integer values (e.g., positive integers after applying sort() and unique()).  For example, this:

c(1,2,5,6,7,8,25,30,31,32,33)

I want to make a matrix from that vector that has two columns: (1) the first value in every run of consecutive integer values, and (2) the corresponding number of consecutive values.  For example:

c(1:20) would become this...

1  20

...because there are 20 consecutive integers beginning with 1 and
c(1,2,5,6,7,8,25,30,31,32,33) would become

1  2
5  4
25 1
30 4

What would be the best way to accomplish this?  Here is my first effort:

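v <- c(1,2,5,6,7,8,25,30,31,32,33)
# v minus its index is constant within a run of consecutive integers
L <- rle( v - seq_along(v) )$lengths
n <- length( L )
# each run starts one past the cumulative length of the runs before it
matrix( c( v[ c( 1, cumsum(L)+1 ) ][1:n], L), nrow=n)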

      [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]   25    1
[4,]   30    4

I suppose that works well enough, but there may be a better way, and besides, I wouldn't want to deny anyone here the opportunity to solve a fun puzzle.  ;-)

The use for this is that I will be doing repeated seeks of a binary file to extract data.  seek() gives the starting point and readBin(n=X) gives the number of records to read.  So when there are many consecutive variables to be read, I can multiply the X in n=X by that number instead of doing many different seek() calls.  (The data are in a transposed format where I read in every record for some variable as sequential elements.)  I'm probably not the first person to deal with this.
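
A rough sketch of that read pattern, under assumptions not stated above: the file holds 8-byte doubles, each variable occupies X = 3 consecutive records, and variables are numbered from 1. The temporary file and its contents are invented purely so the example runs. Each run costs one seek() and one readBin() call.

```r
# Assumed layout (illustrative only): X doubles per variable, stored back to back.
size <- 8L                                   # bytes per double
X <- 3L                                      # records per variable (assumed)
f <- tempfile()
writeBin(as.double(seq_len(40 * X)), f, size = size)   # 40 variables of made-up data

v <- c(1,2,5,6,7,8,25,30,31,32,33)           # variables wanted, sorted and unique
starts <- c(1, which(diff(v) != 1) + 1)
runs <- cbind(v[starts], diff(c(starts, length(v) + 1)))  # first variable, run length

con <- file(f, "rb")
out <- unlist(lapply(seq_len(nrow(runs)), function(i) {
  # byte offset of the first record of the run's first variable
  seek(con, where = (runs[i, 1] - 1) * X * size)
  # one read covers every variable in the run
  readBin(con, what = "double", n = runs[i, 2] * X, size = size)
}))
close(con)
unlink(f)
```

Here out holds X values per requested variable, in the order given by v, from four reads rather than eleven.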

Best,

Mike

-- 
Michael B. Miller, Ph.D.
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ
