[R] Count number of consecutive zeros by group

S Ellison S.Ellison at LGCGroup.com
Thu Oct 31 19:26:56 CET 2013


> If I apply your function to my test data:
> 
....
> the result is
> 1 2 3
> 2 2 2
> 
...
> I think f2 does not return the max of consecutive zeros, but the max of any
> consecutive number... Any idea how to fix this?

The toy example of tapply using f2 does indeed return the maximum run lengths irrespective of the value repeated. 
If you want to select runs of a particular value, you can do so using the $values element of the rle object, again inside the function.
Modifying to accommodate that (and again avoiding a data frame name the same as a base R function name - you managed it again!):

dfr <- data.frame(ID = c(1,1,1,2,2,3,3,3,3), x = c(1,0,0,0,0,1,1,0,1))

f3 <- function(x) {
  runs <- rle(x == 0L) # Often wise to be careful with == and numbers ... see FAQ 7.31
  with(runs, max(lengths[values]))
  # This works because in this case the entries of $values are logical:
  # TRUE for runs where x == 0 and FALSE otherwise; see ?'[' for why
  # indexing lengths by a logical vector works
}
with(dfr, tapply(x, ID, f3)) 

or, more or less equivalently but a shade more generally

f4 <- function(x, select = 0L) {
  runs <- rle(x)
  with(runs, max(lengths[values == select]))
}
with(dfr, tapply(x, ID, f4)) 
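
To see what both functions are working with, it can help to look at the rle object for a single group. A small sketch using the x values for ID 3 in dfr (the names g3 and runs3 are only for illustration):

```r
g3 <- c(1, 1, 0, 1)                 # x values for ID == 3 in dfr
runs3 <- rle(g3)

runs3$lengths                       # 2 1 1 : two 1s, one 0, one 1
runs3$values                        # 1 0 1 : the value repeated in each run

# Keep only the run lengths whose value is 0, then take the maximum
max(runs3$lengths[runs3$values == 0])   # 1
```

So for that group the longest run of zeros has length 1, even though the longest run of any value has length 2.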

None of this checks that runs of zeros exist in a group; if they don't, max() is handed a zero-length vector, so you'll get a warning and -Inf for that group in the output. You can add extra checks inside the function if that bothers you.
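
One possible way to add such a check is sketched below. The name f5, the choice to return 0 for a group with no matching run, and the extra group 4 in the test data are all assumptions made for illustration, not part of the functions above:

```r
# Hypothetical guarded variant: returns 0 when no run of `select` exists
f5 <- function(x, select = 0L) {
  runs <- rle(x)
  hits <- runs$lengths[runs$values == select]   # lengths of matching runs
  if (length(hits) == 0L) 0L else max(hits)     # avoid max() of nothing
}

# Same data as dfr, plus an assumed group 4 that contains no zeros
dfr2 <- data.frame(ID = c(1,1,1,2,2,3,3,3,3,4,4),
                   x  = c(1,0,0,0,0,1,1,0,1,1,1))
res <- with(dfr2, tapply(x, ID, f5))
res
```

With this data the result should be 2, 2, 1 and 0 for groups 1 to 4, with no warning for group 4. Whether 0 or NA is the better answer for a group without zeros depends on what you do with the result afterwards.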
