[R] aggregate(), tapply(): Why is the order of the grouping variables not kept?

Marius Hofert marius.hofert at math.ethz.ch
Mon Mar 11 21:52:58 CET 2013


Dear expeRts,

The question is rather simple: Why do aggregate() and, similarly, tapply() not keep the order of the grouping variable(s)?

Here is an example:

x <- data.frame(group = rep(LETTERS[1:2], each=10),
                year  = rep(rep(2001:2005, each=2), 2),
                value = rep(1:10, each=2))
## => x is sorted according to group, then year
aggregate(value ~ group + year, data=x, FUN=function(z) z[1])
## => the result is sorted according to year, then group (group varies fastest)

I rather expected this to be the default:

aggregate(value ~ year + group, data=x, FUN=function(z) z[1])[,c(2,1,3)]
## => same row order as the input (sorted according to group, then year)

The same happens with tapply():

as.data.frame(as.table(tapply(x$value, list(x$group, x$year), FUN=function(z) z[1])))
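
For comparison, here is a minimal sketch of another workaround, assuming it is acceptable to reorder after the fact: keep the formula as value ~ group + year and sort the result with order(). The name res is only introduced here for illustration.

```r
## same example data as above
x <- data.frame(group = rep(LETTERS[1:2], each=10),
                year  = rep(rep(2001:2005, each=2), 2),
                value = rep(1:10, each=2))
## aggregate as before; the result has group varying fastest
res <- aggregate(value ~ group + year, data=x, FUN=function(z) z[1])
## reorder the rows by group, then year, and reset the row names
res <- res[order(res$group, res$year), ]
rownames(res) <- NULL
res
```

This avoids the column reshuffling via [,c(2,1,3)], but it is still a manual step after the aggregation.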


Cheers,

Marius


