[R] Problem With Model.Tables Function

Gary Whysong gwhysong at cactus.east.asu.edu
Sat Mar 10 20:44:26 CET 2001


I am using R for the first time in one of my classes.  My students have
alerted me to a problem for which we have not found an answer.  We find
that some means returned by the model.tables function are not correct when
missing data is present in analysis of variance problems.  We have
duplicated the problem using R 1.2.0, 1.2.1, and 1.2.2 under Windows 98
and several distributions of Linux (Red Hat 7.0, Mandrake 7.2, and SuSE
7.0 and 7.1).

The situation is best illustrated with a small example of a randomized
block design having three treatments and four blocks.

> blocks<-factor(c(1,2,3,4,1,2,3,4,1,2,3,4))
> trtmnts<-factor(c(1,1,1,1,2,2,2,2,3,3,3,3))
> data<-c(10,12,9,11,13,15,11,16,18,22,17,19)
> balanced<-aov(data~blocks+trtmnts)
> summary(balanced)
            Df  Sum Sq Mean Sq F value    Pr(>F)
blocks       3  28.250   9.417  10.273  0.008868 **
trtmnts      2 147.167  73.583  80.273 4.676e-05 ***
Residuals    6   5.500   0.917
---
Signif. codes:  0  `***'  0.001  `**'  0.01  `*'  0.05  `.'  0.1  ` '  1
> model.tables(balanced,"means")
Tables of means
Grand mean
 
14.41667
 
 blocks
     1      2      3      4
13.667 16.333 12.333 15.333
 
 trtmnts
    1     2     3
10.50 13.75 19.00
 
Entering the data again, but dropping the observations for treatment 2 in
block 3 and treatment 3 in block 4, we have:
 
> blocks2<-factor(c(1,2,3,4,1,2,4,1,2,3))
> trtmts2<-factor(c(1,1,1,1,2,2,2,3,3,3))
> data2<-c(10,12,9,11,13,15,16,18,22,17)
> unbalanced<-aov(data2~blocks2+trtmts2)
> summary(unbalanced)
            Df  Sum Sq Mean Sq F value    Pr(>F)
blocks2      3  18.267   6.089  7.4341 0.0410993 *
trtmts2      2 126.557  63.279 77.2587 0.0006367 ***
Residuals    4   3.276   0.819
---
Signif. codes:  0  `***'  0.001  `**'  0.01  `*'  0.05  `.'  0.1  ` '  1
> model.tables(unbalanced,"means")
Tables of means
Grand mean
 
14.3
 
 blocks2
        1     2  3    4
    13.67 16.33 13 13.5
rep  3.00  3.00  2  2.0
 
 trtmts2
        1     2     3
    10.68 14.47 18.97
rep  4.00  3.00  3.00
 
We find that the treatment means (trtmts2) are incorrect, although the
numbers of replications indicated are correct. The block means (blocks2)
are correct.
 
The treatment means should be: 10.5, 14.67, and 19.0, respectively.
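
As a cross-check, the raw marginal means can be computed directly from the
observations with tapply. This is only a minimal sketch that re-enters the
unbalanced data from above; the names trt_means and block_means are
introduced here for illustration and are not part of the session shown.

```r
# Unbalanced data: treatment 2/block 3 and treatment 3/block 4 are missing.
blocks2 <- factor(c(1,2,3,4,1,2,4,1,2,3))
trtmts2 <- factor(c(1,1,1,1,2,2,2,3,3,3))
data2   <- c(10,12,9,11,13,15,16,18,22,17)

# Plain arithmetic means of the observations within each factor level.
trt_means   <- tapply(data2, trtmts2, mean)
block_means <- tapply(data2, blocks2, mean)

print(round(trt_means, 2))
print(round(block_means, 2))
```

The treatment means obtained this way are the values quoted above, and the
block means agree with the blocks2 table printed by model.tables.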

Further investigation reveals that we encounter this problem whenever we
deal with unequal replications or missing data, for example with unequal
subsamples or with missing data in factorial experiments.  We can get
the correct means by using regression techniques (lm) to solve the
analysis of variance problems and extracting the fitted values from the
appropriate lm model.
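
One way the lm workaround might look is sketched below. It assumes the same
additive model as the aov call and averages the fitted values within each
treatment; the names fit and fit_means are hypothetical, and this is one
possible reading of "the appropriate lm model", not the only one.

```r
# Unbalanced data re-entered so the sketch runs on its own.
blocks2 <- factor(c(1,2,3,4,1,2,4,1,2,3))
trtmts2 <- factor(c(1,1,1,1,2,2,2,3,3,3))
data2   <- c(10,12,9,11,13,15,16,18,22,17)

# Same additive model as aov(data2 ~ blocks2 + trtmts2), fitted with lm.
fit <- lm(data2 ~ blocks2 + trtmts2)

# Average the fitted values over the observations in each treatment.
# Least-squares residuals sum to zero within every level of a factor
# in the model, so these averages equal the raw treatment means.
fit_means <- tapply(fitted(fit), trtmts2, mean)
print(round(fit_means, 2))
```

Because the residuals sum to zero within each treatment level, the averaged
fitted values reproduce the means 10.5, 14.67, and 19.0 given above.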

Since I am learning R, perhaps I have missed something?  Is this possibly
a bug in the model.tables function?

------------------------------------------------------------
   Gary Whysong, Associate Professor, Environmental Resources
   Morrison School of Agribusiness & Resource Management
   Arizona State University East
   Phone: (480) 727-1263, E-mail: gwhysong at Cactus.east.asu.edu
------------------------------------------------------------

