[R] how calculate mean for each group

Marc Schwartz MSchwartz at medanalytics.com
Thu Oct 2 16:37:15 CEST 2003


On Thu, 2003-10-02 at 08:47, Spencer Graves wrote:
> An alternative to renaming columns in the output of aggregate is to 
> provide names in the "by" list as follows: 
> 
> aggregate(df$treatment, list(gp=df$group, dup=df$duplicate), mean)
> 
> hope this helps.  spencer graves

SNIP

Spencer,

Yeah, knew that. Using the above you would get:

> aggregate(df$treatment, list(gp=df$group, dup=df$duplicate), mean)
  gp dup   x
1  A   N 6.0
2  B   N 4.0
3  A   Y 4.0
4  B   Y 1.5

Which still leaves the mean column generically labeled as 'x'.

To take it one more step, given the way in which aggregate.data.frame is
coded and the way in which df$treatment is passed as a vector, you could
use the following to label the mean column as 'treatment':

df <- data.frame(group = c(rep("A", 3), rep("B", 3)),
                 duplicate = c("Y", "Y", "N", "Y", "N", "Y"),
                 treatment = c(5, 3, 6, 2, 4, 1))

attach(df)

aggregate(as.data.frame(treatment), 
          list(group = group, duplicate = duplicate), mean)

which yields:

  group duplicate treatment
1     A         N       6.0
2     B         N       4.0
3     A         Y       4.0
4     B         Y       1.5


Remember to 'detach(df)'.

Doing it this way, 'treatment' is passed to aggregate as a one-column
data frame that keeps its column name, rather than as an unnamed vector.
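
If you would rather not attach() and detach() at all, the same labeling
can be had by indexing the data frame directly. What follows is only a
sketch, not part of the original exchange: 'res1' and 'res2' are names
made up for illustration, and the formula interface in the second call is
an assumption about your R version, since it exists in current releases
but was not available when this thread was written. Both calls should
give the same group/duplicate/treatment table shown above.

```r
df <- data.frame(group = c(rep("A", 3), rep("B", 3)),
                 duplicate = c("Y", "Y", "N", "Y", "N", "Y"),
                 treatment = c(5, 3, 6, 2, 4, 1))

# df["treatment"] is a one-column data frame, so the column name survives.
# df[c("group", "duplicate")] is also a data frame (hence a list), so it
# supplies the grouping variables together with their names.
res1 <- aggregate(df["treatment"], df[c("group", "duplicate")], mean)

# Formula interface (assumes a current R release): the response and the
# grouping variables are named in the formula and looked up in 'data'.
res2 <- aggregate(treatment ~ group + duplicate, data = df, FUN = mean)

print(res1)
print(res2)
```

Neither call touches the search path, so there is nothing to detach
afterwards.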

Thanks for pointing that out.

Regards,

Marc



