[R] F-values in nested designs

Marcus Nunes marcus.nunes at gmail.com
Wed Oct 5 16:51:19 CEST 2011


Dennis,

thanks for your help. I've read your email and the references you gave
and things are more clear to me.

Best,

Marcus



On Tue, Oct 4, 2011 at 19:28, Dennis Murphy <djmuser at gmail.com> wrote:
>
> Hi:
>
> > INB4: if I have a nested design with treatment A and treatment B
> > within A, F-values are MSA/MSA(B) and MSA(B)/MSE, correct? How can I
> > make R give these values directly, without further coding?
>
> This is how to get an equivalent model in lme4, but it probably isn't
> what you expect (particularly the 'without further coding' part).
> Using your example,
>
> library('lme4')
> dn <- data.frame( y = c(10, 12, 8, 13, 14, 8, 10, 12,
>                        9, 10, 12, 11, 11, 13, 9, 10, 14,
>                        11, 10, 9, 8, 9, 8, 8, 13, 14, 7,
>                        10, 10, 13, 9, 7, 16, 12, 5, 4),
>                 areas = factor(rep(c("m1", "m2", "m3"), each=12)),
>                 sites = factor(rep(1:4, 9)))
> > a <- lmer(y ~ areas + (1 | areas:sites), data = dn)
> > a
> Linear mixed model fit by REML
> Formula: y ~ areas + (1 | areas:sites)
>   Data: dn
>   AIC BIC logLik deviance REMLdev
>  171.1 179 -80.56      167   161.1
> Random effects:
>  Groups      Name        Variance Std.Dev.
>  areas:sites (Intercept) 3.25     1.8028
>  Residual                4.50     2.1213
> Number of obs: 36, groups: areas:sites, 12
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)   10.750      1.090   9.865
> areasm2       -0.750      1.541  -0.487
> areasm3       -0.750      1.541  -0.487
>
> Correlation of Fixed Effects:
>        (Intr) aresm2
> areasm2 -0.707
> areasm3 -0.707  0.500
> ##------
>
> lme4 reports the estimated variance components and their square roots,
> the standard deviation components (Std.Dev). The estimated residual
> variance component is 4.5, which is the same as the residual MSE from
> the Minitab output. The estimated variance component associated with
> sites nested within areas (areas:sites) is 3.25. Since the design is
> balanced, the expected mean square of this term (assuming the model
> assumptions are correct) is $\sigma_e^2 + 3 \sigma_s^2$, which is
> estimated by 4.5 + 3(3.25) = 14.25, the observed mean square for sites
> within areas, again coinciding with the Minitab output. However,
> lmer() does not report the result of an F-test for the 'significance'
> of the sites variance component, because the null hypothesis
> $\sigma_s^2 = 0$ is on the boundary of the parameter space and there
> are questions about the reliability of p-values for such tests. See
> http://rwiki.sciviews.org/doku.php?id=guides:lmer-tests   In other
> words, don't accept the reported p-value re the sites variance from
> Minitab on faith. This answers (even in multi-stratum models in aov()
> )
>
> > 2) why I don't have an F-value for the nested effect? I realize that R
> > call it as Residuals in the first part of the summary, but there is a
> > way to make R consider it s another factor?
>
> To get the fixed effects part of Minitab's ANOVA table with lmer(),
> > anova(a)
> Analysis of Variance Table
>      Df Sum Sq Mean Sq F value
> areas  2 1.4208 0.71039  0.1579
>
> Once again, the p-value is not reported (by design). Assuming that the
> specified normal-theory based model is correct, the conventional F
> test for testing the null hypothesis of equal area means would be the
> mean square ratio of areas to sites, which would have an F(2, 9)
> distribution under the null hypothesis. The p-value of that test would
> be
>
> > 1 - pf(0.1579, 2, 9)
> [1] 0.85625
>
> Apart from the needless test of the sites within areas variance
> component, the lmer() output corresponds to that of the Minitab table.
> The output from lmer() gives you the capacity to do much more, but it
> helps to understand some of the theory behind mixed models first.
>
> The transition from fixed effects ANOVA to random effects and mixed
> models is not a smooth one - multiple sources of random variation
> complicate both testing and confidence/prediction interval procedures,
> with several messages on R-sig-mixed-models (including the one cited
> above) discussing such issues at length.
>
> As I said, this is probably not what you expected.
>
> Dennis
>
>
>
>
> On Tue, Oct 4, 2011 at 11:17 AM, Marcus Nunes <marcus.nunes at gmail.com> wrote:
> > Hello all
> >
> > I'm trying to learn how to fit a nested model in R. I found a toy
> > example on internet where a dataset that have 3 areas and 4 sites
> > within these areas. When I use Minitab to fit a nested model to this
> > data, this is the ANOVA table that I got:
> >
> > Nested ANOVA: y versus areas, sites
> >
> > Analysis of Variance for y
> > Source  DF        SS       MS      F      P
> > areas    2    4.5000   2.2500  0.158  0.856
> > sites    9  128.2500  14.2500  3.167  0.012
> > Error   24  108.0000   4.5000
> > Total   35  240.7500
> >
> > When I use R, this is the ANOVA table that I got:
> >
> > summary(aov(y ~ areas + Error(areas%in%sites)))
> >
> > Error: areas:sites
> >          Df Sum Sq Mean Sq F value Pr(>F)
> > areas      2   4.50    2.25  0.1579 0.8563
> > Residuals  9 128.25 14.25
> >
> > Error: Within
> >          Df Sum Sq Mean Sq F value Pr(>F)
> > Residuals 24    108     4.5
> > Warning message:
> > In aov(y ~ areas + Error(areas %in% sites)) : Error() model is singular
> >
> > The results are the same, except for one F-value and I don't
> > understand why. Hence, these are my questions:
> >
> > 1) I searched google and I can't find a reason to have this warning in
> > my code. Why is this happening?
> >
> > 2) why I don't have an F-value for the nested effect? I realize that R
> > call it as Residuals in the first part of the summary, but there is a
> > way to make R consider it s another factor?
> >
> > INB4: if I have a nested design with treatment A and treatment B
> > within A, F-values are MSA/MSA(B) and MSA(B)/MSE, correct? How can I
> > make R give these values directly, without further coding?
> >
> > Thanks for your help.
> >
> > Below is my code and information about my system.
> > ----------------------
> > y = c(10, 12, 8, 13, 14, 8, 10, 12, 9, 10, 12, 11, 11, 13, 9, 10, 14,
> > 11, 10, 9, 8, 9, 8, 8, 13, 14, 7, 10, 10, 13, 9, 7, 16, 12, 5, 4)
> > areas = as.factor(rep(c("m1", "m2", "m3"), each=12))
> > #sites = as.factor(c(rep(c(1, 2, 3, 4), 3), rep(c(5, 6, 7, 8), 3),
> > rep(c(9, 10, 11, 12), 3)))
> > sites = as.factor(c(rep(c(1, 2, 3, 4), 9)))
> > repl  = as.factor(rep(c(1, 2, 3), each=4, 3))
> >
> > summary(aov(y ~ areas + Error(areas%in%sites)))
> >
> > summary(aov(y ~ areas + Error(areas%in%sites)))
> > Error: areas:sites
> >           Df Sum Sq Mean Sq F value Pr(>F)
> > areas      2   4.50    2.25  0.1579 0.8563
> > Residuals  9 128.25   14.25
> > Error: Within
> >           Df Sum Sq Mean Sq F value Pr(>F)
> > Residuals 24    108     4.5
> > Warning message:
> > In aov(y ~ areas + Error(areas %in% sites)) : Error() model is singular
> >
> >
> >
> > sessionInfo()
> > R version 2.13.1 Patched (2011-08-25 r56798)
> > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> >
> > locale:
> > [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> >
> > attached base packages:
> > [1] splines   stats     graphics  grDevices utils     datasets  methods
> > [8] base
> >
> > other attached packages:
> > [1] car_2.0-11         survival_2.36-9    nnet_7.3-1
> > [4] MASS_7.3-14        lme4_0.999375-40   Matrix_0.999375-50
> > [7] lattice_0.19-33    nlme_3.1-102
> >
> > loaded via a namespace (and not attached):
> > [1] grid_2.13.1   stats4_2.13.1 tools_2.13.1
> > --
> > Marcus Nunes
> > marcus.nunes at gmail.com
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >



--
Marcus Nunes
marcus.nunes at gmail.com



More information about the R-help mailing list