[R] summarizing dataframe at variable/factor levels

Afshartous, David afshart at exchange.sba.miami.edu
Thu Jul 5 18:17:55 CEST 2007


Is there an efficient way to apply, say, "mean" or "median" to a data frame
according to all combinations of two variables in the data frame?
Below is a simple example and the outline of a "manual" solution that
works but is not very efficient (it could also be generalized into a
function).  I searched the archives and docs but didn't see anything
close to this question.


dat.ex = data.frame(  rep(c(1:6), each=6), c(rnorm(12), rnorm(12, 1),
rnorm(12, 2)), rnorm(36, 5), rep(c(1:6), 6),
rep(c("Drug1", "Drug2", "Placebo"), each=12) )
names(dat.ex) = c("patient.no", "outcome", "x", "time", "drug")

Mean of the first 2 time points on Drug1:
mean.time.1.drug.1 = mean( dat.ex[dat.ex$time==1 & dat.ex$drug=="Drug1",
mean.time.2.drug.1 = mean( dat.ex[dat.ex$time==2 & dat.ex$drug=="Drug1",

dat.ex.reduced = as.data.frame(rbind(mean.time.1.drug.1,
dat.ex.reduced$Drug = c("Drug1", "Drug1")  ## add back Drug variable and time variable
dat.ex.reduced$time = c(1,2)
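
One possible way to get such a summary for every combination at once, sketched here under assumptions rather than offered as a definitive answer: base R's aggregate() applies a function to chosen columns within each combination of the grouping variables. The fixed seed and the helper name summarize.by are made up for illustration and do not come from the post above.

```r
## Rebuild the example data with named columns; the seed is an assumption
## added only so that the sketch is reproducible
set.seed(1)
dat.ex <- data.frame(
  patient.no = rep(1:6, each = 6),
  outcome    = c(rnorm(12), rnorm(12, 1), rnorm(12, 2)),
  x          = rnorm(36, 5),
  time       = rep(1:6, 6),
  drug       = rep(c("Drug1", "Drug2", "Placebo"), each = 12)
)

## Hypothetical helper: apply FUN to outcome and x within every
## time-by-drug combination, returning one row per combination
summarize.by <- function(dat, FUN = mean) {
  aggregate(dat[c("outcome", "x")],
            by  = dat[c("time", "drug")],
            FUN = FUN)
}

dat.ex.mean   <- summarize.by(dat.ex, mean)
dat.ex.median <- summarize.by(dat.ex, median)
head(dat.ex.mean)
```

The result keeps the grouping columns (time, drug) alongside the summarized ones, so nothing has to be added back by hand. tapply() or by() would be alternatives if a table or list layout is preferred over a data frame.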
