[R] mixed-effects models with (g)lmer in R and model selection

Wilbert Heeringa wjheeringa at gmail.com
Fri Feb 19 14:01:11 CET 2016

Dear all,

Mixed-effects models are wonderful for analyzing data, but it is always a
hassle to find the best model, i.e. the model with the lowest AIC,
especially when the number of predictor variables is large.

At present, when trying to find the right model, I follow these steps:


   Start with a model containing all predictors. Assuming dependent
   variable X and predictors A, B, C, D, E, I start with: X~A+B+C+D+E

   lmer warns that it has dropped columns/coefficients. These are variables
   which have a *perfect* correlation with one of the other variables or
   with a combination of variables. With summary() it can be seen which
   columns have been dropped. Assuming predictor D has been dropped, I
   continue with this model: X~A+B+C+E

   Subsequently I need to check whether there are variables (or groups of
   variables) which *strongly* correlate with each other. I included the
   function vif.mer (developed by Austin F. Frank and available at:
   https://raw.github.com/aufrank/R-hacks/master/mer-utils.R) in my script,
   and when applying this function to my reduced model, I get vif (variance
   inflation factor) values for each of the variables. When vif>5 for a
   predictor, it probably should be removed. In case multiple variables have
   a vif>5, I first remove the predictor with the highest vif, then re-run
   lmer and vif.mer. I again remove the predictor with the highest vif (if one
   or more predictors still have a vif>5), and I repeat this until none of the
   remaining predictors has a vif>5. If I got a warning "Model failed to
   converge" in the larger model(s), this warning no longer appears in the
   'cleaned' model.

   Assume the following predictors have survived: A, B and E. Now I want to
   find the combination of predictors that gives the smallest AIC. For three
   predictors it is easy to try all combinations, but with 10 predictors,
   manually trying all combinations would be time-consuming. So I use the
   function fitLMER.fnc from the LMERConvenienceFunctions package. This
   function back-fits fixed effects, forward-fits random effects, and
   re-back-fits fixed effects. I consider the model given by fitLMER.fnc as
   the right one.
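
For illustration, the steps above can be sketched in R roughly as follows.
This is only a minimal sketch under assumptions that are not part of the
procedure described above: the data are simulated, a random intercept for a
made-up grouping factor `subject` is assumed, and the helpers `fit_lmm` and
`vif_fixef` are hypothetical names. `vif_fixef` re-implements the general idea
of a VIF computed from the fixed-effects covariance matrix; it is not the
vif.mer function itself. The last step is a plain brute-force search over all
predictor subsets, which is not what fitLMER.fnc does. Models are fitted with
REML = FALSE there, since AIC comparisons between models with different fixed
effects are normally based on ML fits.

```r
library(lme4)

## Simulated data (assumption): 30 subjects with 10 observations each.
set.seed(1)
n_subj <- 30
n_obs  <- 10
d <- data.frame(subject = factor(rep(seq_len(n_subj), each = n_obs)))
d$A <- rnorm(nrow(d))
d$B <- rnorm(nrow(d))
d$C <- d$A + rnorm(nrow(d), sd = 0.1)  # strongly (not perfectly) correlated with A
d$D <- d$A + d$B                       # perfect linear combination of A and B
d$E <- rnorm(nrow(d))
subj_eff <- rnorm(n_subj)[as.integer(d$subject)]
d$X <- 1 + 2 * d$A - d$B + subj_eff + rnorm(nrow(d))

## Hypothetical helper: fit X ~ <preds> + (1 | subject).
fit_lmm <- function(preds, REML = TRUE) {
  rhs <- paste(c(preds, "(1 | subject)"), collapse = " + ")
  lmer(as.formula(paste("X ~", rhs)), data = d, REML = REML)
}

## Hypothetical helper: VIFs from the correlation matrix of the
## fixed-effect estimates, with the intercept excluded.
vif_fixef <- function(fit) {
  v <- as.matrix(vcov(fit))
  keep <- rownames(v) != "(Intercept)"
  v <- v[keep, keep, drop = FALSE]
  setNames(diag(solve(cov2cor(v))), rownames(v))
}

## Steps 1 and 2: fit the full model; lmer drops perfectly collinear columns,
## and fixef() then lists only the coefficients that were actually estimated.
full <- fit_lmm(c("A", "B", "C", "D", "E"))
kept <- setdiff(names(fixef(full)), "(Intercept)")

## Step 3: remove the predictor with the highest VIF until all VIFs are <= 5.
repeat {
  m    <- fit_lmm(kept)
  vifs <- vif_fixef(m)
  if (length(kept) < 2 || max(vifs) <= 5) break
  kept <- setdiff(kept, names(which.max(vifs)))
}

## Step 4: brute-force search over all non-empty subsets of the survivors,
## comparing ML fits by AIC.
subsets <- unlist(
  lapply(seq_along(kept), function(k) combn(kept, k, simplify = FALSE)),
  recursive = FALSE
)
aics <- vapply(subsets, function(p) AIC(fit_lmm(p, REML = FALSE)), numeric(1))
names(aics) <- vapply(subsets, paste, character(1), collapse = " + ")
best <- subsets[[which.min(aics)]]

print(sort(aics))
print(best)
```

With p surviving predictors this fits 2^p - 1 models, so the brute-force
search is only practical for a modest number of predictors.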

I am not an expert in mixed-effects models and have struggled with model
selection. I found that the procedure I described works, but I would
really appreciate hearing whether the procedure is sound, or whether
there are better alternatives.




More information about the R-help mailing list