[R] crosstabulation of means

Mark Myatt mark at myatt.demon.co.uk
Fri May 24 10:27:19 CEST 2002


Heberto,

You write:

>Hello, I am trying to print a crosstabulation of mean, sd, n for a
>continuous variable cross-classified by one or more other grouping
>variables. I came up with:
>xtab2 <- function(x,g1,g2) {
> funy <- function(z)
>   list(mean(z,na.rm=T),sd(z,na.rm=T),length(z))
> aa <- by(x,list(g1,g2),funy)
> bb <- matrix(unlist(aa),nrow=3
>   ,dimnames=list(c("mean","sd","n"),
>                rep(levels(as.factor(g2)),
>                rep(length(levels(as.factor(g1))),
>                    length(levels(as.factor(g2)))) ) ))
>}
>but as you can see the labels of the columns correspond only to the
>second factor, and if I try to generalize it to 3 grouping variables the
>thing does not work.
>Any suggestions please on how to write the dimnames paragraph
>appropriately?

No ... but here is some code that, I think, does what you want. I
think I got it from a previous post to R-help and tidied it up for my
own use:

means <- function(x, i, j = 1, k = 1)
{
  # Summary for one cell: mean, SD and number of non-missing values
  FUN <- function(x)
  {
    c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE), sum(!is.na(x)))
  }

  # Cross-classify only by the grouping variables that actually vary
  g <- list(i)
  if (length(unique(j)) > 1)
  {
    g <- c(g, list(j))
    if (length(unique(k)) > 1) g <- c(g, list(k))
  }

  t <- tapply(x, g, FUN)
  # tapply() leaves empty cells NULL; fill them so later cells do not shift
  t[vapply(t, is.null, logical(1))] <- list(c(NA, NA, 0))

  # dim(t) and dimnames(t) already match the grouping variables
  a <- array(unlist(t), c(3, dim(t)),
             c(list(c("Mean", "SD", "N")), dimnames(t)))
  a <- formatC(a, digits = 3, width = 10, format = "f")
  print(a, quote = FALSE)
}

data(mtcars)
means(mtcars$mpg, mtcars$gear, mtcars$carb)

, , 1

     3          4          5         
Mean     20.333     29.100         NA
SD        1.935      5.062         NA
N         3.000      4.000      0.000

, , 2

     3          4          5         
Mean     17.150     24.750     28.200
SD        2.092      3.961      3.111
N         4.000      4.000      2.000

, , 3

     3          4          5         
Mean     16.300         NA         NA
SD        1.054         NA         NA
N         3.000      0.000      0.000

, , 4

     3          4          5         
Mean     12.620     19.750     15.800
SD        2.090      1.552         NA
N         5.000      4.000      1.000

, , 6

     3          4          5         
Mean         NA         NA     19.700
SD           NA         NA         NA
N         0.000      0.000      1.000

, , 8

     3          4          5         
Mean         NA         NA     15.000
SD           NA         NA         NA
N         0.000      0.000      1.000

You should be able to get the dimnames() argument from that.
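
For any number of grouping variables, one possible approach is to let
tapply() supply the dimnames rather than building them by hand. The
following is only a rough sketch: the name xtabn and its "..." interface
are made up for illustration, and it assumes that empty cells should be
reported as NA, NA, 0.

```r
# Hypothetical helper (not from the original post): mean, SD and N of x
# for every combination of the grouping variables passed in `...`.
xtabn <- function(x, ...)
{
  stats <- function(z)
  {
    c(mean(z, na.rm = TRUE), sd(z, na.rm = TRUE), sum(!is.na(z)))
  }
  t <- tapply(x, list(...), stats)
  # tapply() returns NULL for empty cells; fill them so nothing shifts
  t[vapply(t, is.null, logical(1))] <- list(c(NA, NA, 0))
  # dim(t) and dimnames(t) already describe the grouping variables,
  # so only the leading "statistic" dimension has to be added
  array(unlist(t), dim = c(3, dim(t)),
        dimnames = c(list(c("Mean", "SD", "N")), dimnames(t)))
}

# Three grouping variables: statistic x gear x carb x am
a <- xtabn(mtcars$mpg, mtcars$gear, mtcars$carb, mtcars$am)
dimnames(a)
```

The result is a plain numeric array, so it can still be passed through
formatC() and print() if a formatted table is wanted.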

Mark

--
Mark Myatt




