[R] About the merge

William Dunlap wdunlap at tibco.com
Wed Mar 11 21:01:35 CET 2009


You can assign a group number to each run of identical
values of 'Rep' with
    firstInRun <- with(data, c(TRUE, Rep[-1]!=Rep[-length(Rep)]))
    group <- cumsum(firstInRun)
where 'data' is your data.frame's name.  Once you've
assigned the runs to groups then you can use, e.g.,
sapply(split()) to compute the group sums
    summed_App_dur <- with(data, sapply(split(App_dur,group), sum))
and put them into the shortened data.frame with
    newdata <- data[firstInRun,]
    newdata$App_dur <- summed_App_dur 
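
As a quick check, the steps above can be run on the rows from the quoted
question. This is only a sketch: the data.frame is retyped by hand with
three of the original columns (Dtime, Rep, App_dur) and the original row
names, and it assumes nothing beyond base R.

```r
# Hand-typed copy of the question's data (subset of columns only).
data <- data.frame(
    Dtime   = c("14:36:11", "14:36:12", "14:37:38", "14:38:36", "14:38:37",
                "14:38:38", "14:38:47", "14:38:50", "14:38:51", "14:38:53",
                "14:39:32", "14:39:33", "14:39:34", "14:39:42"),
    Rep     = c(4, 3, 0, 3, 4, 1, 0, 1, 4, 4, 3, 4, 3, 4),
    App_dur = c(1, 86, 58, 1, 1, 9, 3, 1, 2, 39, 1, 1, 8, 62),
    row.names = c(9, 10, 11, 14, 15, 16, 18, 20, 21, 23, 25, 26, 27, 28)
)

# TRUE wherever a row starts a new run of identical Rep values.
firstInRun <- with(data, c(TRUE, Rep[-1] != Rep[-length(Rep)]))

# Running count of run starts gives each row its run (group) number.
group <- cumsum(firstInRun)

# Sum App_dur within each run.
summed_App_dur <- with(data, sapply(split(App_dur, group), sum))

# Keep the first row of each run and overwrite App_dur with the run total.
newdata <- data[firstInRun, ]
newdata$App_dur <- summed_App_dur

print(newdata)
```

Rows 21 and 23 are the only adjacent rows with the same Rep, so the result
has 13 rows and the row that keeps the 14:38:51 timestamp carries
App_dur 41.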

There are various tricks for speeding up the sapply(split(),sum),
if that is a problem.
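
Two possible speedups are sketched below. They are illustrations only and
not necessarily the tricks meant above: base R's rowsum() computes grouped
sums without building a list, and because the groups here are contiguous
runs, the run totals can also be read off a cumulative sum. The vectors
are retyped from the quoted data; whether either version is actually
faster is worth timing on your own data.

```r
Rep     <- c(4, 3, 0, 3, 4, 1, 0, 1, 4, 4, 3, 4, 3, 4)
App_dur <- c(1, 86, 58, 1, 1, 9, 3, 1, 2, 39, 1, 1, 8, 62)

firstInRun <- c(TRUE, Rep[-1] != Rep[-length(Rep)])
group <- cumsum(firstInRun)

# Baseline: split into a list of runs, then sum each piece.
viaSapply <- sapply(split(App_dur, group), sum)

# Alternative 1: rowsum() returns a one-column matrix of per-group sums.
viaRowsum <- rowsum(App_dur, group)[, 1]

# Alternative 2: take the cumulative sum at the last row of each run and
# difference consecutive values. This relies on the groups being runs.
lastInRun <- c(firstInRun[-1], TRUE)
viaCumsum <- diff(c(0, cumsum(App_dur)[lastInRun]))

# All three should agree.
stopifnot(all.equal(unname(viaSapply), unname(viaRowsum)),
          all.equal(unname(viaSapply), viaCumsum))
```

The cumsum version works with doubles throughout, so for very long or
non-integer App_dur columns the results may differ from the others by
rounding error.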

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

-------------------------------------------------------------
         Date    Dtime Hour Min Second Rep App_dur
9  2006-02-22 14:36:11   14  36     11   4       1
10 2006-02-22 14:36:12   14  36     12   3      86
11 2006-02-22 14:37:38   14  37     38   0      58
14 2006-02-22 14:38:36   14  38     36   3       1
15 2006-02-22 14:38:37   14  38     37   4       1
16 2006-02-22 14:38:38   14  38     38   1       9
18 2006-02-22 14:38:47   14  38     47   0       3
20 2006-02-22 14:38:50   14  38     50   1       1
21 2006-02-22 14:38:51   14  38     51   4       2   ***
23 2006-02-22 14:38:53   14  38     53   4      39  ***
25 2006-02-22 14:39:32   14  39     32   3       1
26 2006-02-22 14:39:33   14  39     33   4       1
27 2006-02-22 14:39:34   14  39     34   3       8
28 2006-02-22 14:39:42   14  39     42   4      62

How can I write a program that merges consecutive rows with equal Rep
values into a single row and sums up their App_dur?

In the data frame above, I want to merge
21 2006-02-22 14:38:51   14  38     51   4       2   ***
23 2006-02-22 14:38:53   14  38     53   4      39  ***
into
   2006-02-22 14:38:51   14  38     51   4      41
...
Thanks.

Tammy



