[R] random effects with lme() -- comparison with lm()

Fri Jan 16 01:30:54 CET 2004

Jerome Asselin <jerome at hivnet.ubc.ca> writes:

> Hi all,
> 
> In the (very simple) example below, I have defined a random effect for
> the residuals in lme(). So the model is equivalent to a FIXED effect
> model. Could someone explain to me why lme() still gives two standard
> deviation estimates? I would expect lme() to return either:
> a) an error or a warning for having an unidentifiable model;
> b) only one standard deviation estimate.
> 
> Thank you for your time.
> Jerome Asselin
> 
> > library(nlme)
> > simdat <- data.frame(A=1:4,Y=c(23,43,11,34))
> > simdat
>   A  Y
> 1 1 23
> 2 2 43
> 3 3 11
> 4 4 34
> > lme(Y~1,data=simdat,random=~1|A)
> <...snip...>
> Random effects:
>  Formula: ~1 | A
>         (Intercept) Residual
> StdDev:    12.96007 4.860027
> <...snip...>
> > summary(lm(Y~1,data=simdat))$sigma
> [1] 13.84136
> > sqrt(12.96007^2+4.860027^2)
> [1] 13.84136

The estimates from lme are REML estimates because, as you have seen,
the sum of the estimated variances is correct and in this case only
the sum is well-defined.  Of course there are an infinite number of
other possible REML estimates and that situation is not flagged.
(BTW, I wouldn't say that this is equivalent to a fixed effects
model.  It is still a random effects model with two variance
components.  It just doesn't have well-defined estimates for those two
variance components.)

What has happened is that lme set up the optimization problem and
passed it off to one of the optimizer functions which came up with
converged estimates according to some convergence criterion.  In a
simple situation like this it is easy to determined that the estimates
are not well defined.  However, lme is designed to handle very general
model specifications and catching situations where estimates are not
well defined in the general case is difficult.  So the way we designed
lme is to take the estimates from the optimizer without doing further
analysis to see if they are consistent.

You should find that intervals() applied to your fitted model produces
huge intervals on the variance components, which is one way of
diagnosing an ill-defined or nearly ill-defined model.