[R] Efficient way of creating a shifted (lagged) variable?

Dimitri Liakhovitski dimitri.liakhovitski at gmail.com
Thu Aug 4 22:19:06 CEST 2011


Thanks a lot for the recommendations - some of them I am implementing already.

Just a clarification:
The only reason I compare things to SPSS is that I am the only
person in my office using R. Whenever I work on R code, my goal is
not just to make it work, but also to "boast" to the SPSS users that
it's much easier/faster/niftier in R. So you are preaching to the
choir here.

Dimitri


On Thu, Aug 4, 2011 at 4:02 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>
>
> On Aug 4, 2011, at 11:46, Dimitri Liakhovitski <dimitri.liakhovitski at gmail.com> wrote:
>
>> Thanks a lot, guys!
>> It's really helpful. But, to be objective, it's still quite a few
>> lines longer than in SPSS.
>
> Not once you've sourced the function! For the simple case of a vector, try:
>
> X <- 1:10
> mylag2 <- function(X, lag) {
>  # pad the front with `lag` NAs and drop the last `lag` values
>  c(rep(NA, lag), X[seq_len(length(X) - lag)])
> }
>
> Though this does not work for lead, it is fairly short. Then you could use the *apply family if you needed it on multiple columns or vectors.
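
The helper above handles lag only. As a minimal sketch, assuming base R and nothing else, one index-based helper can cover both directions. The name shift_vec and its argument k are hypothetical and do not come from this thread: positive k lags, negative k leads.

```r
# Hypothetical helper (not from the thread): shift a vector by k positions.
# k > 0 lags (NAs at the front), k < 0 leads (NAs at the end).
shift_vec <- function(x, k = 1) {
  idx <- seq_along(x) - k                   # source position for each output slot
  idx[idx < 1 | idx > length(x)] <- NA      # positions outside x become NA
  x[idx]                                    # indexing by NA yields NA
}

# Toy data frame, assumed for illustration only
df <- data.frame(a = 1:5, b = c(10, 20, 30, 40, 50))

# Apply the same shift to every column, then label the new columns
lagged <- as.data.frame(lapply(df, shift_vec, k = 1))
names(lagged) <- paste0(names(df), ".lag1")
result <- cbind(df, lagged)
```

Because the shift is done by indexing rather than by c(), the class of the input (for example Date or factor) is carried through to the result.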
>
> Cheers,
>
> Josh
>
>> Dimitri
>>
>> On Thu, Aug 4, 2011 at 2:36 PM, Daniel Nordlund <djnordlund at frontier.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>> On Behalf Of Dimitri Liakhovitski
>>>> Sent: Thursday, August 04, 2011 8:24 AM
>>>> To: r-help
>>>> Subject: [R] Efficient way of creating a shifted (lagged) variable?
>>>>
>>>> Hello!
>>>>
>>>> I have a data set:
>>>> set.seed(123)
>>>> y<-data.frame(week=seq(as.Date("2010-01-03"), as.Date("2011-01-31"),by="week"))
>>>> y$var1<-c(1,2,3,round(rnorm(54),1))
>>>> y$var2<-c(10,20,30,round(rnorm(54),1))
>>>>
>>>> # All I need is to create lagged variables for var1 and var2. I looked
>>>> around a bit and found several ways of doing it. They all seem quite
>>>> complicated - while in SPSS it's just a few letters (like LAG()). Here
>>>> is what I've written but I wonder. It works - but maybe there is a
>>>> very simple way of doing it in R that I could not find?
>>>> I need the same for "lead" (opposite of lag).
>>>> Any hint is greatly appreciated!
>>>>
>>>> ### The function I created:
>>>> mylag <- function(x,max.lag=1){   # x has to be a 1-column data frame
>>>>    # embed() puts lag k in column k+1; column 1 (the unlagged values) is dropped
>>>>    temp<-as.data.frame(embed(c(rep(NA,max.lag),x[[1]]),max.lag+1))[2:(max.lag+1)]
>>>>    names(temp)<-paste(names(x),".lag",seq_along(temp),sep="")
>>>>    return(temp)
>>>> }
>>>>
>>>> ### Running mylag to get my result:
>>>> myvars<-c("var1","var2")
>>>> for(i in myvars) {
>>>>   y<-cbind(y,mylag(y[i],max.lag=2))
>>>> }
>>>> (y)
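
For readers unfamiliar with embed(), the sketch below shows on a toy vector why mylag pads the input with NAs and then keeps columns 2 onward. The input values and the column names are assumptions chosen for illustration.

```r
# Toy input (assumed for illustration): five values and a maximum lag of 2
x <- 1:5
max.lag <- 2

# Pad the front with max.lag NAs so that embed() returns one row per original value
m <- embed(c(rep(NA, max.lag), x), max.lag + 1)

# Column 1 holds the original values; column k + 1 holds lag k
colnames(m) <- c("x", paste0("x.lag", seq_len(max.lag)))
m
```

Each row of the result lines up a value with the one or two values that preceded it, which is why dropping the first column leaves exactly the lagged variables.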
>>>>
>>>> --
>>>> Dimitri Liakhovitski
>>>> marketfusionanalytics.com
>>>>
>>>
>>> Dimitri,
>>>
>>> I would first look into the zoo package as has already been suggested.  However, if you haven't already got your solution then here are a couple of functions that might help you get started.  I won't vouch for efficiency.
>>>
>>>
>>> lag.fun <- function(df, x, max.lag=1) {
>>>  for(i in x) {
>>>    for(j in 1:max.lag){
>>>      lagx <- paste(i,'.lag',j,sep='')
>>>      df[,lagx] <- c(rep(NA,j),df[1:(nrow(df)-j),i])
>>>    }
>>>  }
>>>  df
>>> }
>>>
>>> lead.fun <- function(df, x, max.lead=1) {
>>>  for(i in x) {
>>>    for(j in 1:max.lead){
>>>      leadx <- paste(i,'.lead',j,sep='')
>>>      df[,leadx] <- c(df[(j+1):(nrow(df)),i],rep(NA,j))
>>>    }
>>>  }
>>>  df
>>> }
>>>
>>> y <- lag.fun(y,myvars,2)
>>> y <- lead.fun(y,myvars,2)
>>>
>>>
>>> Hope this is helpful,
>>>
>>> Dan
>>>
>>> Daniel Nordlund
>>> Bothell, WA USA
>>>
>>>
>>>
>>
>>
>>
>> --
>> Dimitri Liakhovitski
>> marketfusionanalytics.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Dimitri Liakhovitski
marketfusionanalytics.com


