# [R] subset only if f.e a column is successive for more than 3 values

William Dunlap wdunlap at tibco.com
Fri Sep 28 17:22:59 CEST 2018

Do you also want lines 38 and 39 (in addition to 40:44)?

When you deal with runs of data, think of the rle (run-length encoding)
function.  E.g., here is a barely tested function to find runs of a given
minimum length and a given difference between successive values.  It also
returns a 'runNumber' so you can split the result into runs.

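findRuns <- function(x, minRunLength=3, difference=1) {
    # for integral x, find runs of length at least 'minRunLength'
    # with 'difference' between successive values
    d <- diff(x)
    dRle <- rle(d)
    # TRUE for each difference that belongs to a long-enough run
    w <- rep(dRle$lengths>=minRunLength-1 & dRle$values==difference,
             dRle$lengths)
    # a value is kept if the difference before it or after it is in a run
    values <- x[c(FALSE,w) | c(w,FALSE)]
    # seq_along() keeps this zero-length when no runs are found, so
    # data.frame() returns zero rows instead of signalling an error
    runNumber <- cumsum(c(TRUE, diff(values)!=difference))[seq_along(values)]
    data.frame(values=values, runNumber=runNumber)
}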
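
As a rough illustration of the first two steps in the function (not part
of the function itself), this is what diff() and rle() give for a short
vector.  The vector x here is made up for the example.

```r
x <- c(1, 2, 3, 20, 17, 18)   # made-up input, for illustration only
d <- diff(x)                  # successive differences: 1 1 17 -3 1
r <- rle(d)                   # run-length encoding of the differences
r$lengths                     # 2 1 1 1  (the difference 1 occurs twice in a row)
r$values                      # 1 17 -3 1
```

A run of k equal differences covers k+1 values of x, which is why the
function compares the lengths with minRunLength-1.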

> findRuns(c(10,8,6,4,1,2,3,20,17,18,19,20))
  values runNumber
1      1         1
2      2         1
3      3         1
4     17         2
5     18         2
6     19         2
7     20         2
> findRuns(c(10,8,6,4,1,2,3,20,17,18,19,20), minRunLength=4)
  values runNumber
1     17         1
2     18         1
3     19         1
4     20         1
> findRuns(c(10,8,6,4,1,2,3,20,17,18,19,20), difference=-2)
  values runNumber
1     10         1
2      8         1
3      6         1
4      4         1
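
To connect this back to the airquality example in the question, here is a
minimal sketch, assuming the rows wanted are those whose row numbers form a
gap-free run of at least minLen rows after subsetting.  The names hot, idx,
grp, minLen and runs are made up for illustration, and the sketch uses
cumsum() and ave() instead of findRuns() so that it stands on its own.

```r
minLen <- 3                                  # assumed minimum run length
hot <- subset(airquality, Temp > 80, select = c(Ozone, Temp))
idx <- as.integer(rownames(hot))             # original row numbers
# start a new group whenever the row number does not follow the previous one
grp <- cumsum(c(TRUE, diff(idx) != 1))
# keep only rows whose group has at least minLen members
keep <- ave(idx, grp, FUN = length) >= minLen
runs <- cbind(hot[keep, ], runNumber = grp[keep])
head(runs, 7)
```

With minLen set to 3 this drops rows 29, 35 and 36 but keeps 38 and 39 as
well, since 38:44 has no gap.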

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 27, 2018 at 7:48 AM, Knut Krueger <rhelp using krueger-family.de>
wrote:

> Hi to all
>
> I need a subset of values if there are, e.g., 3 successive values in a
> column of a data frame:
> Example from the subset help page:
>
> subset(airquality, Temp > 80, select = c(Ozone, Temp))
> 29     45   81
> 35     NA   84
> 36     NA   85
> 38     29   82
> 39     NA   87
> 40     71   90
> 41     39   87
> 42     NA   93
> 43     NA   92
> 44     23   82
> .....
>
> I would like to get only
>
> ...
> 40     71   90
> 41     39   87
> 42     NA   93
> 43     NA   92
> 44     23   82
> ....
>
> because the left column is ascending more than, e.g., three times without
> a gap
>
> Any hints for a package, or do I need to build my own function?
>
> Kind Regards Knut