[R] Efficient way of creating a shifted (lagged) variable?

Daniel Nordlund djnordlund at frontier.com
Thu Aug 4 21:22:41 CEST 2011



> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:dimitri.liakhovitski at gmail.com]
> Sent: Thursday, August 04, 2011 11:47 AM
> To: Daniel Nordlund; r-help
> Subject: Re: [R] Efficient way of creating a shifted (lagged) variable?
> 
> Thanks a lot, guys!
> It's really helpful. But, to be objective, it's still quite a few
> lines longer than in SPSS.
> Dimitri
> 
> On Thu, Aug 4, 2011 at 2:36 PM, Daniel Nordlund <djnordlund at frontier.com>
> wrote:
> >
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> >> On Behalf Of Dimitri Liakhovitski
> >> Sent: Thursday, August 04, 2011 8:24 AM
> >> To: r-help
> >> Subject: [R] Efficient way of creating a shifted (lagged) variable?
> >>
> >> Hello!
> >>
> >> I have a data set:
> >> set.seed(123)
> >> y<-data.frame(week=seq(as.Date("2010-01-03"), as.Date("2011-01-31"),by="week"))
> >> y$var1<-c(1,2,3,round(rnorm(54),1))
> >> y$var2<-c(10,20,30,round(rnorm(54),1))
> >>
> >> # All I need is to create lagged variables for var1 and var2. I looked
> >> around a bit and found several ways of doing it. They all seem quite
> >> complicated - while in SPSS it's just a few letters (like LAG()). Here
> >> is what I've written but I wonder. It works - but maybe there is a
> >> very simple way of doing it in R that I could not find?
> >> I need the same for "lead" (opposite of lag).
> >> Any hint is greatly appreciated!
> >>
> >> ### The function I created:
> >> mylag <- function(x,max.lag=1){   # x has to be a 1-column data frame
> >>    temp<-as.data.frame(embed(c(rep(NA,max.lag),x[[1]]),max.lag+1))[2:(max.lag+1)]
> >>    for(i in 1:length(temp)){
> >>      names(temp)[i]<-paste(names(x),".lag",i,sep="")
> >>     }
> >>   return(temp)
> >> }
> >>
> >> ### Running mylag to get my result:
> >> myvars<-c("var1","var2")
> >> for(i in myvars) {
> >>   y<-cbind(y,mylag(y[i],max.lag=2))   # max.lag must be passed to mylag(), not cbind()
> >> }
> >> (y)
> >>
> >> --
> >> Dimitri Liakhovitski
> >> marketfusionanalytics.com
> >>
> >
> > Dimitri,
> >
> > I would first look into the zoo package as has already been suggested.
> > However, if you haven't already got your solution then here are a couple
> > of functions that might help you get started.  I won't vouch for
> > efficiency.
> >
> >
> > lag.fun <- function(df, x, max.lag=1) {
> >  for(i in x) {
> >    for(j in 1:max.lag){
> >      lagx <- paste(i,'.lag',j,sep='')
> >      df[,lagx] <- c(rep(NA,j),df[1:(nrow(df)-j),i])
> >    }
> >  }
> >  df
> > }
> >
> > lead.fun <- function(df, x, max.lead=1) {
> >  for(i in x) {
> >    for(j in 1:max.lead){
> >      leadx <- paste(i,'.lead',j,sep='')
> >      df[,leadx] <- c(df[(j+1):(nrow(df)),i],rep(NA,j))
> >    }
> >  }
> >  df
> > }
> >
> > y <- lag.fun(y,myvars,2)
> > y <- lead.fun(y,myvars,2)
> >
> 
> 

Dimitri,

I (and probably many others on the list) no longer know SPSS; I haven't used it in 30 years.  So I don't know how you would use LAG() in SPSS to achieve what you want, and you didn't give us an example of how you would like to use a lag function in your code.  Without at least some pseudocode demonstrating the simple usage you are looking for, it is hard to give you code that works the way you want.  That being said, you can always use SPSS.
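
For illustration only, here is a minimal base-R sketch of what a one-call lag/lead on a single vector might look like.  The name shift() and its argument k are hypothetical (they do not come from this thread or from any package), and the sketch assumes a plain atomic vector such as a numeric or character column:

```r
# Hypothetical helper, not from the thread: shift a vector by k positions.
# k > 0 gives a lag (values move to later rows); k < 0 gives a lead.
# Positions with no source value are filled with NA.
shift <- function(x, k = 1) {
  n <- length(x)
  k <- as.integer(k)
  if (k == 0L) return(x)
  # Shifting by the full length or more leaves nothing but NAs
  if (abs(k) >= n) return(rep(x[NA_integer_], n))
  pad <- rep(NA, abs(k))
  if (k > 0L) {
    c(pad, x[seq_len(n - k)])        # lag: pad at the top
  } else {
    c(x[seq.int(1 - k, n)], pad)     # lead: pad at the bottom
  }
}

# Small made-up data frame to show the intended usage
d <- data.frame(var1 = c(1, 2, 3, 4, 5))
d$var1.lag1  <- shift(d$var1, 1)     # NA 1 2 3 4
d$var1.lead2 <- shift(d$var1, -2)    # 3 4 5 NA NA
print(d)
```

With a helper like this, each lagged or lead column is a single assignment, which may be closer to the brevity of a built-in LAG().  It makes no attempt to handle grouped data or irregular dates.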

Dan

Daniel Nordlund
Bothell, WA USA


