[R] Lag based on Date objects with non-consecutive values

Sam Albers tonightsthenight at gmail.com
Mon Mar 19 21:10:47 CET 2012


Hello all,

I need to figure out a way to lag a variable by a number of days
without using the zoo package. I am working over a remote R connection
where zoo isn't installed and won't be installed.
That is, I want a function where I can specify the number of days
to lag a variable against a Date-formatted column. That is relatively
easy to do. The problem arises when I don't have consecutive dates: I
can't seem to figure out a way to insert an NA when there is a
non-consecutive date. So, for example:


## A dataframe with non-consecutive dates
set.seed(32)
df1<-data.frame(
           Date=seq(as.Date("1967-06-05","%Y-%m-%d"),by="day", length=5),
           Dis1=rnorm(5, 1,10)
           )
## Second block starts on 1967-06-15, leaving a gap after 1967-06-09
df2<-data.frame(
  Date=seq(as.Date("1967-06-15","%Y-%m-%d"),by="day", length=5),
  Dis1=rnorm(5, 1,10)
  )

df <- rbind(df1,df2); df

## A function to lag the variable by a specified number of days
lag.day <- function (lag.by, data) {
  c(rep(NA,lag.by), head(data$Dis1, -lag.by))
}

## Using the function
df$lag1 <- lag.day(lag.by=1, data=df); df
## returns this data frame

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176  5.494370
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560


## When really what I would like is something like this:

         Date      Dis1      lag1
1  1967-06-05  1.146405        NA
2  1967-06-06  9.732887  1.146405
3  1967-06-07 -9.279462  9.732887
4  1967-06-08  7.856646 -9.279462
5  1967-06-09  5.494370  7.856646
6  1967-06-15  5.070176        NA
7  1967-06-16  3.847314  5.070176
8  1967-06-17 -5.243094  3.847314
9  1967-06-18  9.396560 -5.243094
10 1967-06-19  4.112792  9.396560

So can anyone recommend a way (either using my function or any other
approach) to lag values by lag.by days, so that a row gets NA whenever
the date lag.by days earlier is not in the data?
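
One direction that might work in base R, as a minimal sketch only: instead of shifting by row position, look up the row whose Date is exactly lag.by days earlier with match(), which gives NA when that date is absent. The names lag.by.date, date.col, value.col and demo are hypothetical, and the sketch assumes each Date occurs at most once.

```r
## Sketch: lag by calendar days rather than by row position.
## lag.by.date, date.col and value.col are hypothetical names.
lag.by.date <- function(lag.by, data, date.col = "Date", value.col = "Dis1") {
  ## For each row, find the index of the row dated lag.by days earlier;
  ## match() returns NA where no such date exists in the data.
  idx <- match(data[[date.col]] - lag.by, data[[date.col]])
  ## Indexing with NA yields NA, so gaps in the dates become NA lags.
  data[[value.col]][idx]
}

## Small demo with the same date gap as above (simple values, not rnorm)
demo <- data.frame(
  Date = c(seq(as.Date("1967-06-05"), by = "day", length.out = 5),
           seq(as.Date("1967-06-15"), by = "day", length.out = 5)),
  Dis1 = 1:10
)
demo$lag1 <- lag.by.date(lag.by = 1, data = demo)
demo
```

Because the lookup is by date rather than by position, this should not depend on the rows being sorted, and a lag.by of 2 or more would also put NA on every row whose earlier date falls inside the gap.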

Thanks so much in advance!

Sam


