[R] sum specific rows in a data frame
    Chuck 
    vijay.nori at gmail.com
       
    Thu Apr 15 03:16:25 CEST 2010
    
    
  
Whether aggregate or ddply (from the plyr package) is faster depends on
the size of the data frame and on the operation you are performing.
The code below times both on data frames of increasing size. Inside
the function, df is meant to have the same structure as your data
frame: a grouping column (id) plus numeric columns to sum.
============================
library(plyr)

# Time aggregate() and ddply() on a data frame with 45*n rows and 3 groups.
CompareAggregation <- function(n) {
    df <- data.frame(id = c(rep("A", 15*n), rep("B", 10*n), rep("C", 20*n)))
    df$fltval <- rnorm(nrow(df))
    df$intval <- rbinom(nrow(df), 1000, 0.8)
    t1 <- system.time(zz1 <- aggregate(list(fltsum = df$fltval,
                                            intsum = df$intval),
                                       list(id = df$id), sum))
    t2 <- system.time(zz2 <- ddply(df, .(id), function(x)
        c(fltsum = sum(x$fltval), intsum = sum(x$intval))))
    # Element 1 of system.time() is the user CPU time in seconds.
    return(c(agg = t1[[1]], ddply = t2[[1]]))
}

# Run the comparison for n = 10, 100, ..., 1e5 (450 to 4.5 million rows).
z <- 10^seq(1, 5)
names(z) <- as.character(z)
res.df <- t(data.frame(lapply(z, CompareAggregation)))
print(res.df)
============================
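
If you only want to see what the per-group sums look like before
timing anything, here is a minimal base-R sketch on a toy data frame.
The column names mirror the code above, but the values are made up and
"toy", "sums" and "check" are just illustrative names. It uses the
formula interface of aggregate rather than the list interface used in
the timing code, so treat it as one possible way to write it.

```r
# Toy data: column names follow the timing code, values are invented.
toy <- data.frame(id     = c("A", "A", "B", "C", "C", "C"),
                  fltval = c(0.5, 1.5, 2.0, 1.0, 1.0, 1.0),
                  intval = c(1L, 2L, 3L, 4L, 5L, 6L))

# Sum both value columns within each level of id.
sums <- aggregate(cbind(fltval, intval) ~ id, data = toy, FUN = sum)
print(sums)

# Cross-check one column with tapply(), which returns a named vector.
check <- tapply(toy$intval, toy$id, sum)
print(check)
```

Both calls should give the same totals for intval, one row (or one
element) per id.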
On Apr 14, 11:43 am, "arnaud Gaboury" <arnaud.gabo... at gmail.com>
wrote:
> Thank you for your help. The best I have found is to use the ddply function.
>
> > pose
    
    