[R] How to average subgroups in a dataframe? (not sure how to apply aggregate(..))

Karl Ove Hufthammer karl at huftis.org
Wed Oct 21 14:33:46 CEST 2009


In article <800ACFC0-2C3C-41F1-AF18-3B52F7E43F07 at jhsph.edu>, 
bcarvalh at jhsph.edu says...
> aves = aggregate(df1$score, by=list(col1=df1$col1, col2=df1$col2), mean)
> results = merge(df1, aves)

Or, with the 'plyr' package, which has a very nice syntax:

library(plyr)
ddply(df1, .(col1, col2), transform, Average=mean(score))

It may be a bit slow for very large datasets, though.

Here's an alternative, which will be as fast as the aggregate solution.

within(df1, { Average=ave(score, col1, col2, FUN=mean) } )

Which one you use is a matter of taste.

And of course, the 'within' function is not the important part here; 
'ave' is. For example, if you have attached your data frame with 
attach(df1), you just have to type

Average=ave(score, col1, col2, FUN=mean)
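
To see what 'ave' returns, here is a minimal self-contained sketch. The 
data frame below is made up purely for illustration (only the column 
names col1, col2 and score come from the thread), so treat it as an 
assumption rather than the original poster's data.

```r
# Toy data; the values are invented for illustration only.
df1 <- data.frame(
  col1  = c("a", "a", "a", "b", "b", "b"),
  col2  = c("x", "x", "y", "x", "y", "y"),
  score = c(1, 3, 5, 2, 4, 8)
)

# ave() returns a vector as long as 'score': each element is the mean
# of the (col1, col2) subgroup that the corresponding row belongs to.
df1$Average <- ave(df1$score, df1$col1, df1$col2, FUN = mean)

print(df1)
# The Average column is 2, 2, 5, 2, 6, 6
```

Because the result has one value per row, in the original row order, it 
can be assigned straight to a new column without a separate merge step.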

-- 
Karl Ove Hufthammer



