[R] collapsing a data frame

hadley wickham h.wickham at gmail.com
Sat Oct 13 06:53:20 CEST 2007


> >   Here's a solution that takes the first element of each factor
> > and the mean of each numeric variable.  I can imagine there
> > are more general/flexible solutions.  (One might want to
> > specify more than one summary function, or specify that
> > factors that vary within group should be dropped.)
> >
> > vtype = sapply(h,class)  ## variable types [numeric or factor]
> > vtypes = unique(vtype)   ## possible types
> > v2 = lapply(vtypes,function(z) which(vtype==z))  ## which are which?
> > cfuns = list(factor=function(z)z[1],numeric=mean)## functions to apply
> > m = mapply(function(w,f) { aggregate(h[w],list(h$BROOD),f) },
> >   v2,cfuns[vtypes],SIMPLIFY=FALSE)  ## match functions to types by name
> > data.frame(m[[1]],m[[2]][-1])
> >
> >   My question is whether this is re-inventing the wheel.  Is there
> > some function or package that performs this task?
>
> Maybe the reshape package?  http://had.co.nz/reshape
>
> hm <- melt(h, measure.var = "TICKS")
> cast(hm, BROOD + HEIGHT + YEAR + LOCATION ~ ., mean)
> cast(hm, BROOD + HEIGHT + LOCATION ~ YEAR, mean)
> cast(hm, BROOD ~ HEIGHT ~ YEAR, mean)
>
> You should be able to create just about any data structure you need,
> and if you can't, let me know.

Oh, and you can easily use multiple summary functions too:

cast(hm, BROOD + HEIGHT + YEAR + LOCATION ~ ., c(mean, sd, length))
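
For comparison, here is a rough base-R sketch of the same kind of
multi-function summary, in case reshape isn't installed. The data frame
below is a made-up stand-in for h (the values are invented, only the
column names come from this thread), and summ is just a hypothetical
helper name. It is not meant to reproduce cast's exact output layout.

```r
# Toy stand-in for the poster's data frame `h`; all values are invented.
h <- data.frame(
  BROOD    = factor(c(501, 501, 502, 502, 503)),
  HEIGHT   = c(465, 465, 472, 472, 475),
  YEAR     = factor(c(95, 95, 95, 95, 96)),
  LOCATION = factor(c(32, 32, 36, 36, 37)),
  TICKS    = c(0, 2, 0, 4, 1)
)

# Hypothetical helper: several summaries of one numeric vector at once.
summ <- function(x) c(mean = mean(x), sd = sd(x), length = length(x))

# One row per BROOD/HEIGHT/YEAR/LOCATION combination present in the data.
# aggregate() stores the three summaries as a matrix column named TICKS.
res <- aggregate(TICKS ~ BROOD + HEIGHT + YEAR + LOCATION, data = h, FUN = summ)

# Flatten the matrix column into TICKS.mean, TICKS.sd and TICKS.length.
res <- do.call(data.frame, res)
print(res)
```

Groups with a single observation get NA for the standard deviation,
since sd() of one value is undefined.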

Hadley

-- 
http://had.co.nz/


