[R] Example of cell means model

Francisco Vergara gerifalte28 at hotmail.com
Wed Oct 15 20:17:03 CEST 2003


Thanks a lot for your reply. This helps a lot!
Just to confirm, using lm this model will give me the mean yield value for 
each cell in the two way array.  Now if I want to obtain the mean
of group means (like a SS type III approach) using LME (since I have random 
effects in the model) how can I parametrize this?
I could definitivelly use xtabs in a two-way case but in my case I have 2 
other (continuous) covariates that are potential confounders in
the model so I need to keep them to obtain the corrected means.
I added a continuous variable (NewVar) to the dataset Newxmp11.07 and 
obtaned a model with the covariate (fm4) and another without it (fm3)

>Newxmp11.07<-fix(xmp11.07)
>str(Newxmp11.07)
`data.frame':   36 obs. of  4 variables:
$ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
$ Variety: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
$ Density: Factor w/ 4 levels "1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
$ NewVar : num  10 9 7 12 11 11 12 11 15 16 ...

>fm3<-gls(Yield~-1+Variety + Density, xmp11.07)
>fm4<-gls(Yield~-1+Variety + Density+NewVar, Newxmp11.07)
>fm3
Coefficients:
Variety1  Variety2  Variety3  Density2  Density3  Density4
8.922222  9.797222 15.713889  2.911111  4.300000  2.433333

Degrees of freedom: 36 total; 30 residual
Residual standard error: 1.239243

>fm4
Coefficients:
   Variety1    Variety2    Variety3    Density2    Density3    Density4
8.75757265  9.70589316 15.43347009  2.88152564  4.32186752  2.40117521
     NewVar
0.01157692

Degrees of freedom: 36 total; 29 residual
Residual standard error: 1.252991

fm4 gives me the mean of the group means for all the varieties but 
apparently it gives me the treatment contrasts for the densities.  If I 
change the order of the factors in the model specification I get

>coef(fm5<-gls(Yield~-1+ Density+Variety+NewVar, Newxmp11.07))
   Density1    Density2    Density3    Density4    Variety2    Variety3
8.75757265 11.63909829 13.07944017 11.15874786  0.94832051  6.67589744
     NewVar
0.01157692

This, just like fm4 will include the original intercept value in Density 1 
which is not the actual density 1 mean.  What am I missing? I am sorry if 
these questions are very basic but I want to make sure that I understand 
what I am doing. I guess that this is the price that I am paying for having 
used in the past packages like SAS where you just ask for lsmeans and the 
software will give you a "black box" answer!

Best regards

Francisco


>From: Douglas Bates <bates at stat.wisc.edu>
>To: "Francisco Vergara" <gerifalte28 at hotmail.com>
>CC: r-help at r-project.org
>Subject: Re: [R] Example of cell means model
>Date: 15 Oct 2003 08:40:33 -0500
>
>This is an example from chapter 11 of the 6th edition of Devore's
>engineering statistics text.  It happens to be a balanced data set in
>two factors but the calculations will also work for unbalanced data.
>
>I create a factor called 'cell' from the text representation of the
>Variety level and the Density level using '/' as the separator
>character. The coefficients for the linear model 'Yield ~ cell - 1'
>are exactly the cell means.
>
> > library(Devore6)
> > data(xmp11.07)
> > str(xmp11.07)
>`data.frame':	36 obs. of  3 variables:
>  $ Yield  : num  10.5 9.2 7.9 12.8 11.2 13.3 12.1 12.6 14 10.8 ...
>  $ Variety: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 1 ...
>  $ Density: Factor w/ 4 levels "1","2","3","4": 1 1 1 2 2 2 3 3 3 4 ...
> > xmp11.07$cell = with(xmp11.07, factor(paste(Variety, Density, sep = 
>'/')))
> > xtabs(~ Variety + Density, xmp11.07)
>        Density
>Variety 1 2 3 4
>       1 3 3 3 3
>       2 3 3 3 3
>       3 3 3 3 3
> > means = xtabs(Yield ~ Variety+Density, xmp11.07)/xtabs(~ Variety + 
>Density, xmp11.07)
> > means
>        Density
>Variety 1         2         3         4
>       1  9.200000 12.433333 12.900000 10.800000
>       2  8.933333 12.633333 14.500000 12.766667
>       3 16.300000 18.100000 19.933333 18.166667
> > coef(fm1 <- lm(Yield ~ cell - 1, xmp11.07))
>   cell1/1   cell1/2   cell1/3   cell1/4   cell2/1   cell2/2   cell2/3   
>cell2/4
>  9.200000 12.433333 12.900000 10.800000  8.933333 12.633333 14.500000 
>12.766667
>   cell3/1   cell3/2   cell3/3   cell3/4
>16.300000 18.100000 19.933333 18.166667
>
>The residual sum of squares for this model is the same as that for the
>model with crossed terms
>
> > deviance(fm1)
>[1] 38.04
> > deviance(fm2 <- lm(Yield ~ Variety * Density, xmp11.07))
>[1] 38.04
>
>but the coefficients are quite different because they represent a
>different parameterization.
>
> > coef(fm2)
>       (Intercept)          Variety2          Variety3          Density2
>        9.20000000       -0.26666667        7.10000000        3.23333333
>          Density3          Density4 Variety2:Density2 Variety3:Density2
>        3.70000000        1.60000000        0.46666667       -1.43333333
>Variety2:Density3 Variety3:Density3 Variety2:Density4 Variety3:Density4
>        1.86666667       -0.06666667        2.23333333        0.26666667
>
>I hope this answers your question.  Sorry for the delay in
>responding to you.
>
>"Francisco Vergara" <gerifalte28 at hotmail.com> writes:
>
> > Many thanks for your reply.  An example would be very helpful to make
> > sure that I understand correctly what you described.

_________________________________________________________________
Surf and talk on the phone at the same time with broadband Internet access. 
Get high-speed for as low as $29.95/month (depending on the local service 
providers in your area).  https://broadband.msn.com




More information about the R-help mailing list