[R] data prep question

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jan 16 16:04:07 CET 2011


On Sat, Jan 15, 2011 at 4:26 PM, Matthew Strother <rstrothe at gmail.com> wrote:
> I have a data set with several thousand observations across time, grouped by subject (example format below)
>
> ID              TIME    OBS
> 001             2200    23
> 001             2400    11
> 001             3200    10
> 001             4500    22
> 003             3900    45
> 003             5605    32
> 005             1800    56
> 005             1900    34
> 005             2300    23
> ...
>
> I would like to identify the first time for each subject, and then subtract this value from each subsequent time.  However, the number of observations per subject varies widely (from 1 to 20), and the intervals between times vary widely.  Is there a package that can help do this, or a loop that can be set up to evaluate ID, then calculate the values?  The outcome I would like is presented below.
> ID              TIME    OBS
> 001             0       23
> 001             200     11
> 001             1000    10
> 001             2300    22
> 003             0       45
> 003             1705    32
> 005             0       56
> 005             100     34
> 005             500     23

Since the data frame appears to be already sorted by time within ID we
can do this:

> transform(DF, TIME = ave(TIME, ID, FUN = function(x) x - x[1]))
  ID TIME OBS
1  1    0  23
2  1  200  11
3  1 1000  10
4  1 2300  22
5  3    0  45
6  3 1705  32
7  5    0  56
8  5  100  34
9  5  500  23
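
If the rows were not already sorted by TIME within ID, x[1] would not necessarily be each subject's earliest time. A minimal sketch of one way around that, assuming DF has the three columns shown in the question (the data frame below is rebuilt from the posted example, with ID read as a number, and DF2 is just an illustrative name), subtracts the per-subject minimum instead of the first element:

```r
# Rebuild the example data from the question (assumption: ID is numeric)
DF <- data.frame(
  ID   = c(1, 1, 1, 1, 3, 3, 5, 5, 5),
  TIME = c(2200, 2400, 3200, 4500, 3900, 5605, 1800, 1900, 2300),
  OBS  = c(23, 11, 10, 22, 45, 32, 56, 34, 23)
)

# Subtract each subject's earliest TIME from all of that subject's times.
# min(x) does not depend on the row order within each ID, unlike x[1].
DF2 <- transform(DF, TIME = ave(TIME, ID, FUN = function(x) x - min(x)))
```

On sorted data this gives the same result as using x[1]; the only difference is that it still measures from the earliest time when rows arrive out of order.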

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


