[R] interpreting glmer results

Mon Oct 5 22:13:22 CEST 2009

On Mon, Oct 5, 2009 at 11:57 AM, Bert Gunter <gunter.berton at gene.com> wrote:
[snip]
> -- ... and if the correlations are "high" it tells you that your model may
> be near unidentifiable = the model parameters may not be effectively
> estimated from the data. To understand what "high", "near" and "effectively"
> may mean for your data, CYLS ("Consult your local statistician")
>
> (If you really wish to use sophisticated tools like glmer, you really need
> to understand what you're doing. There is no guarantee of immunity from the
> consequences of ignorance.)

Indeed.  I was simply trying to answer the 'basic' linear models
questions and staying away from judgements.  However, since you bring
it up, I'll go ahead and climb up on a soapbox ;-)

\begin{rant}

I think it is a problem in ecology (and I'm sure other fields) that
there is huge demand for tools allowing inferences about complex
systems, yet very few have the skills necessary to safely use the
tools provided by statisticians.  For example, a common situation for
an ecologist is to be faced with analyzing observational data with
temporal and spatial non-independence of observations, lack of
balance, lack of normality, and often zero inflation and/or
under/over-dispersion.  Reviewers know enough to understand the
problems this presents for classical techniques, and therefore use of
complex tools (such as mixed models or hierarchical Bayesian models)
can become a prerequisite to getting published.  In other words,
careers depend on using tools that ecologists who spend their time
focused on ecology rather than mathematical statistics have little
hope of truly understanding.  This is certainly no jab at the
intelligence of ecologists -- it's just that when you get into areas
such as drawing inferences from a GLMM, the proportion of
statisticians, even, who understand the subtleties and pitfalls is
small, and when you throw in, say, zero inflation and spatially
structured covariance matrices, that small proportion dwindles
drastically.

\end{rant}

So, I suppose what I should have done after mentioning the LRT was to
provide this list I sent to r-sig-ecology a while back (with an LMM in
mind):

- LRTs aren't valid for comparing REML fits with
different fixed effects because REML essentially
maximizes the likelihood of the error contrasts A'Y,
where E[A'Y] = 0, so changing the fixed effects
changes A, which changes the data being modeled
and makes the likelihoods non-comparable.
- Pinheiro and Bates (2000, pg 87-88) recommend
that LRTs with the standard chi-squared (X^2)
distribution not be used to compare ML fits with
different fixed effects because the tests can be
very "anticonservative", particularly as the number
of parameters being removed becomes large relative
to the number of observations.
- LRTs for differences in the random part of the
model when the fixed effects are the same can be
conservative because the null value of 0 is on
the boundary of the variance parameter space.
- It seems that counting the number of parameters
being estimated will be a problem when comparing
models that differ in their random effects.
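
To make the third point concrete, here is a rough arithmetic sketch (in
Python, standard library only) of how the boundary problem shows up in a
p-value. It assumes the commonly cited asymptotic reference for testing a
single variance component against 0, a 50:50 mixture of a point mass at 0 and
a chi-squared with 1 df; that mixture is my assumption here, not something
stated in the list above, and it does not cover tests of several variance
parameters at once. The function names and the statistic 3.0 are made up for
illustration and do not come from any fitted model.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 df."""
    if x <= 0:
        return 1.0
    # X = Z^2 with Z standard normal, so P(X > x) = P(|Z| > sqrt(x)).
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalues(lrt_stat):
    """Return (naive, mixture) p-values for testing one variance component = 0.

    naive   : statistic referred to chi-squared(1)
    mixture : statistic referred to an assumed 50:50 mixture of a point
              mass at 0 and chi-squared(1)
    """
    naive = chi2_sf_1df(lrt_stat)
    # The point mass at 0 contributes nothing to the upper tail when stat > 0.
    mixture = 1.0 if lrt_stat <= 0 else 0.5 * naive
    return naive, mixture


# Hypothetical LRT statistic, 2 * (logLik_full - logLik_reduced).
naive_p, mixture_p = boundary_lrt_pvalues(3.0)
print("naive chi-squared(1) p-value:", round(naive_p, 4))
print("50:50 mixture p-value:       ", round(mixture_p, 4))
```

Under that assumption the naive p-value is twice the mixture one for any
positive statistic, which is the sense in which the naive test is
conservative. In R the naive tail probability would come from pchisq() with
lower.tail = FALSE.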

best,

Kingsford Jones

>
> -- Bert
>
> hth,
>
> Kingsford
>
>
>
>> Many thanks for any help.
>>
>> Cheers,
>> Umesh Srinivasan,
>> Bangalore, India
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>