[R] Carrying a value down a data.frame conditionally

William Dunlap wdunlap at tibco.com
Wed Dec 24 17:29:14 CET 2014


A while ago I wrote for a questioner on this list a function, 'f1', below,
that would give the start and stop times of runs of data that started when
the data went above a threshold and stopped when it first dropped
below a different (lower) threshold.  It used no loops and was pretty
quick.

With your data you could use it as
  > ss <- with(df, f1((Value>=40) + Signal*2, startThreshold=2, stopThreshold=1))
  > ss
    start stop
  1     2    4
  2    10   10
You can convert those start and stop times to a vector with 1's in the runs
and 0's outside of the runs with something like
  > v <- integer(length(df$Value))
  > v[ss$start] <- 1
  > v[pmin(ss$stop+1, length(v))] <- -1
  > cumsum(v)
   [1] 0 1 1 1 0 0 0 0 0 1 0 0
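One caveat with clamping stop+1 to length(v): if a run lasts through the
final element, the -1 lands on that final element and switches it off.
A minimal sketch of a variant that sidesteps this by giving the vector one
spare slot follows.  The helper name 'runsToIndicator' is hypothetical (it
is not part of f1), and the start/stop values in the usage line are typed
in by hand from the output above.

```r
# Hypothetical helper: turn start/stop indices into a 0/1 indicator of length n.
runsToIndicator <- function(start, stop, n) {
    # +1 at each start and -1 just past each stop; the spare slot n + 1
    # absorbs the -1 when a run lasts through the final element.
    steps <- tabulate(start, n + 1L) - tabulate(stop + 1L, n + 1L)
    # The running sum is 1 inside runs and 0 outside; drop the spare slot.
    cumsum(steps)[seq_len(n)]
}

# Start/stop values copied from the 'ss' output above (assumed, not recomputed).
runsToIndicator(start = c(2, 10), stop = c(4, 10), n = 12)
```

Using tabulate() rather than subscript assignment also keeps the counts
right if a +1 and a -1 ever fall on the same position.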

f1 would be trivial to write in C/C++.  It needs a better name.


f1 <-
function(x, startThreshold, stopThreshold, plot=FALSE) {
    # Find intervals that
    #  start when x first reaches startThreshold (x >= startThreshold) and
    #  end at the last point before x drops below stopThreshold.
    stopifnot(startThreshold > stopThreshold)
    isFirstInRun <- function(x)c(TRUE, x[-1] != x[-length(x)])
    isLastInRun <- function(x)c(x[-1] != x[-length(x)], TRUE)
    isOverStart <- x >= startThreshold
    isOverStop <- x >= stopThreshold
    possibleStartPt <- which(isFirstInRun(isOverStart) & isOverStart)
    possibleStopPt <- which(isLastInRun(isOverStop) & isOverStop)
    pts <- c(possibleStartPt, possibleStopPt)
    names(pts) <- rep(c("start","stop"),
      c(length(possibleStartPt), length(possibleStopPt)))
    pts <- pts[order(pts)]
    tmp <- isFirstInRun(names(pts))
    start <- pts[tmp & names(pts)=="start"]
    stop <- pts[tmp & names(pts)=="stop"]
    # Remove case where first downcrossing happens
    # before first upcrossing.
    if (length(stop) > length(start)) stop <- stop[-1]

    if (plot) {
        plot(x, cex=.5)
        abline(h=c(startThreshold, stopThreshold))
        abline(v=start, col="green")
        abline(v=stop, col="red")
    }
    data.frame(start=start, stop=stop)
}
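
For checking f1 on small inputs, a plain loop with an on/off state is an
easy cross-check.  This is only a sketch: the name 'inRunLoop' is
hypothetical, and it returns the 0/1 indicator directly rather than
start/stop pairs.  It is meant to follow the same rule (switch on at
x >= startThreshold, switch off at x < stopThreshold), not to replace the
vectorized version.

```r
# Hypothetical reference implementation: one pass with an on/off state.
inRunLoop <- function(x, startThreshold, stopThreshold) {
    stopifnot(startThreshold > stopThreshold)
    inRun <- logical(length(x))
    on <- FALSE
    for (i in seq_along(x)) {
        if (!on && x[i] >= startThreshold) {
            on <- TRUE            # upcrossing: a run begins at i
        } else if (on && x[i] < stopThreshold) {
            on <- FALSE           # downcrossing: the run ended at i - 1
        }
        inRun[i] <- on
    }
    as.integer(inRun)
}

# Data from the original question, encoded the same way as in the f1 call.
df <- data.frame(Value  = c(0,0,100,85,39,1,30,40,20,20,0,0),
                 Signal = c(0,1,0,0,0,0,0,0,0,1,0,0))
x <- with(df, (Value >= 40) + Signal * 2)
inRunLoop(x, startThreshold = 2, stopThreshold = 1)
```

On the data from the question this gives 0 1 1 1 0 0 0 0 0 1 0 0, the
same as the To_Be_Produced column.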


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Dec 23, 2014 at 2:57 PM, Pooya Lalehzari <plalehzari at platinumlp.com>
wrote:
>
> Hello,
> I have a data.frame (below) containing the two fields of "Value" and
> "Signal" and I would need to create the third field of "To_Be_Produced".
> The condition for producing the third field is to carry the 1 in the
> "Signal" field down until "Value" is below 40.
> Do I have to create a for-loop to do this or will I be able to do anything
> else more efficient?
>
>
> df <- data.frame( Value=c(0,0,100,85,39,1,30,40,20,20,0,0),
>                   Signal=c(0,1,0,0,0,0,0,0,0,1,0,0),
>                   To_Be_Produced= c(0,1,1,1,0,0,0,0,0,1,0,0)
>                 )
>
> Thank you,
> Pooya.
>
