[R] Chronological data manipulation question

Julien Barnier jbarnier at ens-lsh.fr
Tue Oct 16 12:43:51 CEST 2007


Hi all,

I am currently working on a survey that contains biographical data
stored chronologically, i.e. something like:

id      year     variable
001     2000     0
001     2001     0
001     2002     1
001     2003     0
002     1996     0
002     1997     0
002     1998     1
002     1999     0
002     2000     0

where id is a person identifier, year is the year of observation, and
variable is the value of the variable in that year. Here, the variable
indicates whether a particular event happened during the given year.

What I want to do is generate a new variable that says whether the
event happened at least once during the current year or the five years
preceding it. So if I call this new variable v2, I'd like to obtain:

id      year     variable      v2
001     2000     0             0
001     2001     0             0
001     2002     1             1
001     2003     0             1
002     1996     0             0
002     1997     0             0
002     1998     1             1
002     1999     0             1
002     2000     0             1

Currently I manage to achieve this with two nested for loops, but it
is *very* slow and inefficient. So I wondered if there is a better way
to do this.

Thanks in advance for any help.

PS: here is the code to reproduce the sample data above:

data.frame(id=c("001","001","001","001","002","002","002","002","002"),
           year=c(2000,2001,2002,2003,1996,1997,1998,1999,2000),
           variable=c(0,0,1,0,0,0,1,0,0))
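
For illustration, here is a minimal sketch of one loop-free way the rule
could be computed in base R. It is only one possibility, not a definitive
solution. It assumes the window covers the current year and the five
preceding ones, which is what the expected v2 column suggests. The names
d, flag_recent and window are hypothetical and chosen only for this
example.

```r
# Sample data from the post, stored under the hypothetical name `d`.
d <- data.frame(id = c("001","001","001","001","002","002","002","002","002"),
                year = c(2000,2001,2002,2003,1996,1997,1998,1999,2000),
                variable = c(0,0,1,0,0,0,1,0,0))

# Hypothetical helper: for one person, flag each year whose window
# [year - window, year] contains at least one year in which the event occurred.
flag_recent <- function(year, variable, window = 5) {
  event_years <- year[variable == 1]
  # No event at all for this person: every year gets 0.
  if (length(event_years) == 0) return(rep(0, length(year)))
  # lag[i, j] = year[i] - event_years[j]
  lag <- outer(year, event_years, "-")
  # A row is flagged when some event lies 0 to `window` years back.
  as.numeric(rowSums(lag >= 0 & lag <= window) > 0)
}

# Apply the helper within each id and put the results back in row order.
d$v2 <- unsplit(
  Map(flag_recent, split(d$year, d$id), split(d$variable, d$id)),
  d$id
)
```

Because the comparison is made on the year values rather than on row
positions, this sketch should also tolerate gaps in the observed years,
at the cost of building one small matrix per person.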

-- 
Julien Barnier
Groupe de recherche sur la socialisation
ENS-LSH - Lyon, France


