[R] Optimisation and NaN Errors using clm() and clmm()

Rune Haubo rhbc at imm.dtu.dk
Tue Apr 16 08:54:26 CEST 2013


On 15 April 2013 13:18, Thomas <thomasfoxley at aol.com> wrote:
>
> Dear List,
>
> I am using both the clm() and clmm() functions from the R package 'ordinal'.
>
> I am fitting an ordinal dependent variable with 5 categories to 9 continuous predictors, all of which have been normalised (mean subtracted then divided by standard deviation), using a probit link function. From this global model I am generating a confidence set of 200 models using clm() and the 'glmulti' R package. This produces these errors:
>
> > model.2.10 <- glmulti(as.factor(dependent) ~ predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*predictor_6*predictor_7*predictor_8*predictor_9, data = database, fitfunc = clm, link = "probit", method = "g", crit = aicc, confsetsize = 200, marginality = TRUE)
> ...
> After 670 generations:
> Best model: as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
> Crit= 183.716706496392
> Mean crit= 202.022138576506
> Improvements in best and average IC have been below the specified goals.
> Algorithm is declared to have converged.
> Completed.
> There were 24 warnings (use warnings() to see them)
> > warnings()
> Warning messages:
> 1: optimization failed: step factor reduced below minimum
> 2: optimization failed: step factor reduced below minimum
> 3: optimization failed: step factor reduced below minimum
> etc.
>
>
> I am then re-fitting each of the 200 models with the clmm() function, with 2 random factors (family nested within order). I get this error in a few of the re-fitted models:
>
> > model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 + predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2 + predictor_7:predictor_2 + predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + predictor_9:predictor_2 + predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 + predictor_9:predictor_8 + (1|order/family), link = "probit", data = database)
> > summary(model.2.glmm.2)
> >
> Cumulative Link Mixed Model fitted with the Laplace approximation
>
> formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 + predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2 + predictor_7:predictor_2 +
> predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + predictor_9:predictor_2 +
> predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 + predictor_9:predictor_8 + (1 | order/family)
> data: database
>
> link threshold nobs logLik AIC niter max.grad cond.H
> probit flexible 103 -65.56 173.13 58(3225) 8.13e-06 4.3e+03
>
> Random effects:
> Var Std.Dev
> family:order 7.493e-11 8.656e-06
> order 1.917e-12 1.385e-06
> Number of groups: family:order 12, order 4
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> predictor_1 0.40802 0.78685 0.519 0.6041
> predictor_2 0.02431 0.26570 0.092 0.9271
> predictor_3 -0.84486 0.32056 -2.636 0.0084 **
> predictor_6 0.65392 0.34348 1.904 0.0569 .
> predictor_7 0.71730 0.29596 2.424 0.0154 *
> predictor_8 -1.37692 0.75660 -1.820 0.0688 .
> predictor_9 0.15642 0.28969 0.540 0.5892
> predictor_2:predictor_6 -0.46880 0.18829 -2.490 0.0128 *
> predictor_2:predictor_7 4.97365 0.82692 6.015 1.80e-09 ***
> predictor_3:predictor_7 -1.13192 0.46639 -2.427 0.0152 *
> predictor_2:predictor_8 -5.52913 0.88476 -6.249 4.12e-10 ***
> predictor_1:predictor_9 4.28519 NA NA NA
> predictor_2:predictor_9 -0.26558 0.10541 -2.520 0.0117 *
> predictor_3:predictor_9 -1.49790 NA NA NA
> predictor_6:predictor_9 -1.31538 NA NA NA
> predictor_7:predictor_9 -4.41998 NA NA NA
> predictor_8:predictor_9 3.99709 NA NA NA
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Threshold coefficients:
> Estimate Std. Error z value
> 0|1 -0.2236 0.3072 -0.728
> 1|2 1.4229 0.3634 3.915
> (211 observations deleted due to missingness)
> Warning message:
> In sqrt(diag(vc)[1:npar]) : NaNs produced
>

This warning is due to a (near-)singular variance-covariance matrix of
the model parameters, which in turn is caused by the model converging
to a boundary solution: both random-effects variance parameters are
estimated as (essentially) zero. If you exclude the random terms and
refit the model with clm(), the variance-covariance matrix will
probably be well defined and standard errors can be computed.
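For example (a minimal sketch only; 'database' and the predictor names
are taken from your post, and the formula is just the fixed-effects
part of the clmm() fit above, so adjust as needed):

library(ordinal)

## same fixed-effects structure as the clmm() fit, but without (1|order/family)
fit.fixed <- clm(as.factor(dependent) ~ predictor_1 + predictor_2 +
                   predictor_3 + predictor_6 + predictor_7 + predictor_8 +
                   predictor_9 + predictor_2:predictor_6 +
                   predictor_2:predictor_7 + predictor_3:predictor_7 +
                   predictor_2:predictor_8 + predictor_1:predictor_9 +
                   predictor_2:predictor_9 + predictor_3:predictor_9 +
                   predictor_6:predictor_9 + predictor_7:predictor_9 +
                   predictor_8:predictor_9,
                 link = "probit", data = database)

summary(fit.fixed)                            # standard errors should now be defined
all(is.finite(sqrt(diag(vcov(fit.fixed)))))   # FALSE would indicate remaining problems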

Another issue is that you are fitting 17 regression parameters plus 2
random-effects terms (which in the end do not count, since their
variances are estimated as zero) to only 103 observations. I would be
worried about overfitting, or perhaps even non-fitting. I would also
be concerned about the 211 observations that are incomplete, and I
would be careful with automatic model selection/averaging etc. on
incomplete data (though I don't know how/if glmulti actually deals
with that).
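If you want to see the extent of the missingness before doing any
model selection, something along these lines should work (a sketch;
'database' and the variable names are assumed from your post):

## which rows are complete for all variables used in the models?
vars <- c("dependent", paste0("predictor_", 1:9), "order", "family")
cc   <- complete.cases(database[, vars])
table(cc)                      # how many rows are complete vs. incomplete

database.cc <- database[cc, ]  # a single complete-case data set, so that
                               # every candidate model is fitted to the
                               # same observations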

>
> I have tried a number of different approaches, each of which has its own problems. I have fixed these using various suggestions from online forums (e.g. https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015328.html, https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q2/016165.html), and this is as good as I can get it.
>
> After the first stage (generating the model set with glmulti) I tested every model in the confidence set individually; there were no errors, but there was clearly a problem during the model selection process. Should I be worried?

I don't know; I don't use glmulti or automatic model selection
regularly, so I can't say what the consequences might be.

The question is what caused the potential non-convergences for some of
the models that were not chosen. If they failed to converge because
the models are not identifiable, then all is probably fine, but if
they are relevant models that should have converged, then there might
be a problem. However, when a model does not converge there is
usually a good reason for it, so I am not particularly worried that
relevant models are hiding among those that did not converge. Without
looking at a particular model it is hard to say why it might not have
converged, but if you can pinpoint the models that trigger the
warnings/errors, I would be happy to take a further look at them.
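One way to do that (a sketch only; 'candidate.formulas' is assumed to
be a list holding the 200 formulas from your glmulti confidence set,
however you extract them from the glmulti object) is to refit each
candidate with clm() and record any warnings it produces:

fit_with_warnings <- function(form, data) {
  warns <- character(0)
  fit <- withCallingHandlers(
    clm(form, link = "probit", data = data),
    warning = function(w) {
      warns <<- c(warns, conditionMessage(w))  # collect the warning text
      invokeRestart("muffleWarning")
    }
  )
  list(fit = fit, warnings = warns)
}

results  <- lapply(candidate.formulas, fit_with_warnings, data = database)
bad.fits <- which(sapply(results, function(x) length(x$warnings) > 0))
lapply(results[bad.fits], `[[`, "warnings")    # which models warned, and why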

Hope this helps,
Rune

>
> No errors appear in the top 5% of re-fitted models (which are the only ones I will be using); however, I am concerned that the errors may be indicative of a problem with my approach.
>
> A further worry is that the errors might be removing models that could otherwise be included.
>
>
> Any help would be much appreciated.
>
> Tom
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


