[R] reshape: cast(x, a ~ b ~ .) vs. cast(x, a ~ b) difference

Philip Kensche pkensche at cmbi.ru.nl
Fri Jun 24 10:01:47 CEST 2011


Hi,

I am having trouble understanding what the function cast() from the package reshape does. In the example below I have a 2x2x2 array, which I first melt and then cast to get the averages over the field 'strain' for every combination of the fields 'treatment' and 'gene':

-----
> mdat <- melt(array(rnorm(8), dim=c(2,2,2),
               dimnames=list(strain=c("s1", "s2"),
                             treatment=c("t1", "t2"),
                             gene=c("g1", "g2"))))
> mdat
  strain treatment gene       value
1     s1        t1   g1 -0.39282103
2     s2        t1   g1 -1.36823389
3     s1        t2   g1  0.60153033
4     s2        t2   g1 -0.07080973
5     s1        t1   g2 -1.05012093
6     s2        t1   g2  0.31942129
> cast(mdat, treatment ~ gene, fun.aggregate=mean)
  treatment         g1         g2
1        t1 -0.8805275 -0.3653498
2        t2  0.2653603 -0.6248085
> cast(mdat, treatment ~ gene ~ ., fun.aggregate=mean)
, ,  = (all)

         gene
treatment         g1         g2
       t1 -0.8805275 -0.8805275
       t2 -0.3653498 -0.3653498
-----

The first cast() uses a single tilde and returns a data.frame with the correct values. I also calculated the values with ddply and got the same result:

-----
> ddply(mdat, .(treatment, gene), function (x) mean(x$value))
  treatment gene         V1
1        t1   g1 -0.8805275
2        t1   g2 -0.3653498
3        t2   g1  0.2653603
4        t2   g2 -0.6248085
-----
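
A further cross-check that depends on neither reshape nor plyr can be sketched in base R. This is only an illustration: the seed and the object names (arr, ref, long, chk) are my own assumptions and not the session above, so the numbers will differ from the output shown. The means over 'strain' are computed once directly on the array and once from a long-format data frame analogous to the melted data.

```r
# Hypothetical stand-in for the data above; the seed is arbitrary.
set.seed(42)
arr <- array(rnorm(8), dim = c(2, 2, 2),
             dimnames = list(strain = c("s1", "s2"),
                             treatment = c("t1", "t2"),
                             gene = c("g1", "g2")))

# Mean over 'strain' (dimension 1) for each treatment x gene cell,
# computed directly on the array: margins 2 and 3 are kept.
ref <- apply(arr, c(2, 3), mean)

# The same means from a long-format data frame.
# as.data.frame(as.table()) names the value column 'Freq'.
long <- as.data.frame(as.table(arr))
chk <- tapply(long$Freq, long[c("treatment", "gene")], mean)

# Both routes should agree cell by cell.
stopifnot(isTRUE(all.equal(ref, chk, check.attributes = FALSE)))
print(ref)
```

Any result of cast() for treatment x gene can then be compared against ref cell by cell.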

The second cast() above, however, returns only the averages for the combinations (t1, g1) and (t1, g2), each of them twice. To be honest, this looks like a bug to me, but maybe I am missing something here.

I would appreciate any enlightening comments on this!
Thanks in advance!

Philip





P.S.:

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=de_DE.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape_0.8.3       plyr_1.4            latticeExtra_0.6-11
[4] RColorBrewer_1.0-2  lattice_0.19-17    

loaded via a namespace (and not attached):
[1] grid_2.12.2  tools_2.12.2



--
  | Philip Kensche <pkensche at cmbi.ru.nl>
  | http://www.cmbi.ru.nl/~pkensche
  |
  | Center for Molecular and Biomolecular Informatics
  | http://www2.cmbi.ru.nl
  |
  | phone +31 (0)24 36 19693
  | fax   +31 (0)24 36 19395


