[R] find & remove sequences of at least N values for a specific value

jeff6868 geoffrey_klein at etu.u-bourgogne.fr
Thu Jul 10 14:34:22 CEST 2014

Hi everybody,

I have a small problem in a function, about removing short sequences of
identical numeric values.

As an example, consider this data, containing only "0" and "1" values:

test <- data.frame(x=c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))

The aim here is simply to replace with NA each sequence of consecutive "1"
values shorter than 5, and to keep sequences of "1" that are 5 or longer.
So my final data should look like this:

final <- data.frame(x=c(0,0,NA,NA,NA,0,0,0,0,1,1,1,1,1,1,1,1))
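To make the target concrete, the runs in the example can be listed with base R's rle(). This is only an illustration of the run structure, not part of my function; the names `x_example` and `runs` are just placeholders:

```r
# Same values as test$x above
x_example <- c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1)

# rle() splits the vector into runs of identical consecutive values
runs <- rle(x_example)
runs$lengths   # 2 3 4 8
runs$values    # 0 1 0 1
```

The run of three 1s is the one that should become NA; the run of eight 1s should stay.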

For the moment, I have this function:

    foo <- function(X,N){
      tab <- table(X[X==1])                     # counts all the 1s together
      under.n <- as.numeric(names(tab)[tab<N])  # values occurring fewer than N times
      ind <- X %in% under.n                     # positions to set to NA
      X <- ifelse(ind,NA,X)
      X
    }

test$x <- apply(as.data.frame(test$x),2,function(x) foo(x,5))

The problem is that the function doesn't consider each sequence separately:
table() counts all the "1" values together, as if they formed one single
sequence. I think that using rle() instead of table() in my function should
do the trick, but I haven't managed to make it work yet.
Does someone have an idea about fixing this problem?
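For reference, the rle() idea could be sketched roughly as below. This is a minimal sketch and not a definitive solution: the function name `remove_short_runs` and its `value` argument are made up for illustration, and it assumes that "short" means strictly fewer than N consecutive values. It relies only on base R's rle() and inverse.rle().

```r
# Hypothetical helper (name and `value` argument are illustrative):
# set each run of `value` shorter than N to NA, one run at a time.
remove_short_runs <- function(X, N, value = 1) {
  r <- rle(X)                                   # run lengths and run values
  short <- !is.na(r$values) & r$values == value & r$lengths < N
  r$values[short] <- NA                         # blank out only the short runs
  inverse.rle(r)                                # expand back to the original length
}

test <- data.frame(x = c(0,0,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1))
test$x <- remove_short_runs(test$x, 5)
```

Because each run is tested on its own length, the run of three 1s is replaced while the run of eight 1s is left alone. No apply() call is needed for a single column.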

View this message in context: http://r.789695.n4.nabble.com/find-remove-sequences-of-at-least-N-values-for-a-specific-value-tp4693810.html
Sent from the R help mailing list archive at Nabble.com.
