[R] aggregate(...) with multiple functions

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jul 16 05:02:52 CEST 2010


On Thu, Jul 15, 2010 at 10:45 PM, Murat Tasan <mmuurr at gmail.com> wrote:
> hi all - i'm just wondering what sort of code people write to
> essentially perform an aggregate call, but with different functions
> being applied to the various columns.
>
> for example, if i have a data frame x and would like to marginalize by
> a factor f for the rows, but apply mean() to col1 and median() to
> col2.
>
> if i wanted to apply mean() to both columns, i would call:
>
> aggregate(x, list(f), mean)
>
> but to get the mean of col1 and the median of col2, i have to write
> separate tapply calls, then wrap back into a data frame:
>
> data.frame(tapply(x$col1, f, mean), tapply(x$col2, f, median))
>
> this is a somewhat inelegant solution for data frames with potentially
> many columns.
>
> what i would like is for aggregate to take a list of functions for
> columns, something like:
>
> aggregate(x, list(f), list(mean, median))
>
>
> i'm just curious how others get around this limitation in aggregate().
> do most simply make the individual tapply() calls separately, then
> possibly wrap them back up (as done in the example above), or is there
> a more elegant solution using some function of R that i might be
> unaware of?
>

Using sqldf we can write:

> library(sqldf)
> sqldf("select Treatment, avg(conc), median(uptake) from CO2 group by Treatment")
   Treatment avg(conc) median(uptake)
1    chilled       435           19.7
2 nonchilled       435           31.3

See http://sqldf.googlecode.com for more info.
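
For comparison, here is a minimal base-R sketch of the same per-column idea without sqldf. It assumes one aggregate() call per column is acceptable and merges the pieces on the grouping column; the names co2, funs, and pieces are only illustrative and do not come from any package.

```r
# Plain data.frame copy of the built-in CO2 data set (drops its extra classes).
co2 <- as.data.frame(CO2)

# Illustrative mapping: column name -> summary function to apply to it.
funs <- list(conc = mean, uptake = median)

# One aggregate() call per column, each grouped by Treatment.
pieces <- lapply(names(funs), function(col) {
  aggregate(co2[col], by = list(Treatment = co2$Treatment), FUN = funs[[col]])
})

# Join the per-column summaries on their shared Treatment column.
result <- Reduce(merge, pieces)
print(result)
```

Adding another column only requires another entry in funs, e.g. funs$uptake2 would not work as is since each name must match an existing column; to apply two functions to the same column, one would need to rename the outputs before merging.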


