[R] crosstab means

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Thu Oct 22 16:52:33 CEST 1998


"Heberto Ghezzo" <heberto at MEAKINS.Lan.McGill.CA> writes:

> I would like to obtain a crosstabulation of means(var, quantiles...)
> i.e. I have a data frame with Var-i, Var-j, Var-k, Var-X, var-Y
> I like to have the mean of Var-X for each combination of Var-i,Var-j.
> One solution would be:
> by(var-i,Var-j,mean(Var-x))
> but I would like it better formatted and with mean,S.Dev,n for each 
> cell?
> Does anybody have some function to do this or some ideas how to go 
> about it?

tapply() gets you most of the way there:

> data(warpbreaks)
> attach(warpbreaks)
> tapply(breaks,list(wool,tension),mean)
         L        M        H
A 44.55556 24.00000 24.55556
B 28.22222 28.77778 18.77778
> tapply(breaks,list(wool,tension),sd)
          L        M         H
A 18.097729 8.660254 10.272671
B  9.858724 9.431036  4.893306
> tapply(breaks,list(wool,tension),length)
  L M H
A 9 9 9
B 9 9 9

The tricky bit is printing it in a nicer layout. Something in the
right direction might be:

x <- tapply(breaks, list(wool,tension),
            function(x) c(mean(x), sd(x), length(x)))
a <- array(c(x, recursive=TRUE), c(3,2,3))
dimnames(a) <- c(list(c("Mean","SD","n")), dimnames(x))
aperm(a, c(1,3,2))

which gives:

, , A

            L         M        H
Mean 44.55556 24.000000 24.55556
SD   18.09773  8.660254 10.27267
n     9.00000  9.000000  9.00000

, , B

             L         M         H
Mean 28.222222 28.777778 18.777778
SD    9.858724  9.431036  4.893306
n     9.000000  9.000000  9.000000
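
Why c(3,2,3)? Flattening the tapply() result runs through the cells in
column-major order, and each cell contributes its three statistics in turn,
so the statistic varies fastest, then wool, then tension. The sketch below
(variable names `flat` and `a` are illustrative only) checks one cell
against a direct computation to confirm that the labels line up with the
data.

```r
# Each cell of x holds c(mean, sd, n); x itself is a 2 x 3 list-matrix.
x <- with(warpbreaks,
          tapply(breaks, list(wool, tension),
                 function(v) c(mean(v), sd(v), length(v))))

# Flatten: statistic varies fastest, then wool (rows), then tension (cols).
flat <- unlist(x)
a <- array(flat, dim = c(3, 2, 3),
           dimnames = c(list(c("Mean", "SD", "n")), dimnames(x)))

# Cross-check one cell against a direct calculation.
direct <- with(warpbreaks, mean(breaks[wool == "B" & tension == "M"]))
print(all.equal(unname(a["Mean", "B", "M"]), direct))
```

The dim vector must list the dimensions in that same fastest-to-slowest
order. aperm() is then the right tool for rearranging them for printing.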

Or you can play around with formatC like this:

> t.sd<-formatC(tapply(breaks,list(wool,tension),sd),digits=2,width=10,format="f")
> t.mean<-formatC(tapply(breaks,list(wool,tension),mean),digits=2,width=10,format="f")
> t.n<-formatC(tapply(breaks,list(wool,tension),length),width=10,format="d")
> print(aperm(array(c(t.mean,t.sd,t.n),c(2,3,3),dimnames=
+ c(dimnames(t.mean),list(c("Mean","SD","n")))),c(3,2,1)),quote=FALSE)
, , A

     L          M          H         
Mean      44.56      24.00      24.56
SD        18.10       8.66      10.27
n             9          9          9

, , B

     L          M          H         
Mean      28.22      28.78      18.78
SD         9.86       9.43       4.89
n             9          9          9

(If you don't grasp the dim and dimnames magic, don't despair: Neither
do I, I just fiddled with them till it worked... There's probably also
a more systematic approach.)
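
One candidate for a more systematic approach is to skip the array juggling
and ask for one row per wool/tension cell. This is only a sketch: it
assumes a current R in which aggregate() accepts a formula, and the helper
name `cell_stats` is made up for illustration.

```r
# Hypothetical helper: the three summaries wanted for each cell.
cell_stats <- function(v) c(Mean = mean(v), SD = sd(v), n = length(v))

# One row per wool x tension combination; the summaries arrive as a
# matrix column named `breaks`.
res <- aggregate(breaks ~ wool + tension, data = warpbreaks, FUN = cell_stats)

# Spread the matrix column into ordinary columns and round for display.
out <- cbind(res[c("wool", "tension")], round(res$breaks, 2))
print(out)
```

The result is a long table rather than a crosstab, which gives up the
two-way layout but leaves no room for labels and values to drift apart.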

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907