[R] error using user-defined link function with mixed models (LMER)

Douglas Bates bates at stat.wisc.edu
Sun Feb 11 14:38:34 CET 2007


Look at the 'link' component of the two lists.  In the binomial family
object the link component is a character vector of link 1.  In your
logexposure family object it is a list of length 5.

On 2/10/07, Jessi Brown <jessilbrown at gmail.com> wrote:
> Ok, I've tried checking out the structure of the binomial and
> logexposure families, the big difference  appears to be the valideta
> parameter (it's "NULL" in the logexposure family).
>
> First, binomial:
> > str(binomial())
> List of 11
>  $ family    : chr "binomial"
>  $ link      : chr "logit"
>  $ linkfun   :function (mu)
>  $ linkinv   :function (eta)
>  $ variance  :function (mu)
>  $ dev.resids:function (y, mu, wt)
>  $ aic       :function (y, n, mu, wt, dev)
>  $ mu.eta    :function (eta)
>  $ initialize:  expression({     if (NCOL(y) == 1) {         if
> (is.factor(y))              y <- y != levels(y)[1]         n <-
> rep.int(1, nobs)         if (any(y < 0 | y > 1))              stop("y
> values must be 0 <= y <= 1")         mustart <- (weights * y +
> 0.5)/(weights + 1)         m <- weights * y         if (any(abs(m -
> round(m)) > 0.001))              warning("non-integer #successes in a
> binomial glm!")     }     else if (NCOL(y) == 2) {         if
> (any(abs(y - round(y)) > 0.001))              warning("non-integer
> counts in a binomial glm!")         n <- y[, 1] + y[, 2]         y <-
> ifelse(n == 0, 0, y[, 1]/n)         weights <- weights * n
> mustart <- (n * y + 0.5)/(n + 1)     }     else stop("for the binomial
> family, y must be a vector of 0 and 1's\n",          "or a 2 column
> matrix where col 1 is no. successes and col 2 is no. failures") })
>  $ validmu   :function (mu)
>  $ valideta  :function (eta)
>  - attr(*, "class")= chr "family"
>
>
> Now, logexposure:
> > str(logexposure(link="logexp", ExposureDays=apfa4$days))
> List of 11
>  $ family    : chr "binomial"
>  $ link      :List of 5
>   ..$ linkfun :function (mu)
>   .. ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
>   ..$ linkinv :function (eta)
>   .. ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
>   ..$ mu.eta  :function (eta)
>   .. ..- attr(*, "source")= chr [1:3] "function(eta)" ...
>   ..$ valideta:function (eta)
>   .. ..- attr(*, "source")= chr "function(eta) TRUE"
>   ..$ name    : chr "logexp()"
>   ..- attr(*, "class")= chr "link-glm"
>  $ linkfun   :function (mu)
>   ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
>  $ linkinv   :function (eta)
>   ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
>  $ variance  :function (mu)
>   ..- attr(*, "source")= chr "function(mu) mu * (1 - mu)"
>  $ dev.resids:function (y, mu, wt)
>   ..- attr(*, "source")= chr [1:2] "function(y, mu, wt)
> .Call(\"binomial_dev_resids\"," ...
>  $ aic       :function (y, n, mu, wt, dev)
>   ..- attr(*, "source")= chr [1:7] "function(y, n, mu, wt, dev) {" ...
>  $ mu.eta    :function (eta)
>   ..- attr(*, "source")= chr [1:3] "function(eta)" ...
>  $ initialize:  expression({     if (NCOL(y) == 1) {         if
> (is.factor(y))              y <- y != levels(y)[1]         n <-
> rep.int(1, nobs)         if (any(y < 0 | y > 1))              stop("y
> values must be 0 <= y <= 1")         mustart <- (weights * y +
> 0.5)/(weights + 1)         m <- weights * y         if (any(abs(m -
> round(m)) > 0.001))              warning("non-integer successes in a
> binomial glm!")     }     else if (NCOL(y) == 2) {         if
> (any(abs(y - round(y)) > 0.001))              warning("non-integer
> counts in a binomial glm!")         n <- y[, 1] + y[, 2]         y <-
> ifelse(n == 0, 0, y[, 1]/n)         weights <- weights * n
> mustart <- (n * y + 0.5)/(n + 1)     }     else stop("for the binomial
> family,", " y must be a vector of 0 and 1's\n",          "or a 2
> column matrix where col 1 is", " no. successes and col 2 is no.
> failures") })
>  $ validmu   :function (mu)
>   ..- attr(*, "source")= chr "function(mu) all(mu > 0) && all(mu < 1)"
>  $ valideta  : NULL
>  - attr(*, "class")= chr "family"
>
>
> So, could this be the root of the problem?
>
> Here again is the logexp function:
> logexp <- function(days = 1)
> {
>     linkfun <- function(mu) qlogis(mu^(1/days))
>     linkinv <- function(eta) plogis(eta)^days
>     mu.eta <- function(eta)
>       days*.Call("logit_mu_eta", eta,
>        PACKAGE = "stats")*plogis(eta)^(days-1)
>     valideta <- function(eta) TRUE
>     link <- paste("logexp(", days, ")", sep="")
>     structure(list(linkfun = linkfun, linkinv = linkinv,
>        mu.eta = mu.eta, valideta = valideta, name = link),
>         class = "link-glm")
> }
>
> So, does something seem obviously wrong about the "valideta <-
> function(eta) TRUE" bit? I readily admit that I did not write these
> user-defined link and family functions (I believe they resulted from
> the combined efforts of Mark Herzog and Brian Ripley), so my clumsy
> efforts to toy with the valideta portion of the logexp link function
> aren't working.
>
> Thanks again in advance for any further advice.
>
> cheers, Jessi Brown
>
> On 2/10/07, Douglas Bates <bates at stat.wisc.edu> wrote:
> > On 2/10/07, Jessi Brown <jessilbrown at gmail.com> wrote:
> > > Greetings, everyone. I've been trying to analyze bird nest survival
> > > data using generalized linear mixed models (because we documented
> > > several consecutive nesting attempts by the same individuals; i.e.
> > > repeated measures data) and have been unable to persuade the various
> > > GLMM models to work with my user-defined link function. Actually,
> > > glmmPQL seems to work, but as I want to evaluate a suite of competing
> > > models, I'd like to use ML or REML estimation methods in order to end
> > > up with meaningful log-likelihoods.
> > >
> > > Here's the link function I use:
> > > logexp <- function(days = 1)
> > > {
> > >     linkfun <- function(mu) qlogis(mu^(1/days))
> > >     linkinv <- function(eta) plogis(eta)^days
> > >     mu.eta <- function(eta)
> > >       days*.Call("logit_mu_eta", eta,
> > >        PACKAGE = "stats")*plogis(eta)^(days-1)
> > >     valideta <- function(eta) TRUE
> > >     link <- paste("logexp(", days, ")", sep="")
> > >     structure(list(linkfun = linkfun, linkinv = linkinv,
> > >        mu.eta = mu.eta, valideta = valideta, name = link),
> > >         class = "link-glm")
> > > }
> > >
> > > # Modified binomial family function (that allows logexp link function)
> > > logexposure<-function (link="logexp",ExposureDays) {
> > >     variance <- function(mu) mu * (1 - mu)
> > >     validmu <- function(mu) all(mu > 0) && all(mu < 1)
> > >     dev.resids <- function(y, mu, wt) .Call("binomial_dev_resids",
> > >         y, mu, wt, PACKAGE = "stats")
> > >     aic <- function(y, n, mu, wt, dev) {
> > >         m <- if (any(n > 1))
> > >             n
> > >         else wt
> > >         -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m *
> > >             y), round(m), mu, log = TRUE))
> > >     }
> > >     initialize <- expression({
> > >         if (NCOL(y) == 1) {
> > >             if (is.factor(y)) y <- y != levels(y)[1]
> > >             n <- rep.int(1, nobs)
> > >             if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1")
> > >             mustart <- (weights * y + 0.5)/(weights + 1)
> > >             m <- weights * y
> > >             if (any(abs(m - round(m)) > 0.001))
> > >                  warning("non-integer successes in a binomial glm!")
> > >         } else if (NCOL(y) == 2) {
> > >             if (any(abs(y - round(y)) > 0.001))
> > >                 warning("non-integer counts in a binomial glm!")
> > >             n <- y[, 1] + y[, 2]
> > >             y <- ifelse(n == 0, 0, y[, 1]/n)
> > >             weights <- weights * n
> > >             mustart <- (n * y + 0.5)/(n + 1)
> > >         } else stop("for the binomial family,",
> > >                 " y must be a vector of 0 and 1's\n",
> > >             "or a 2 column matrix where col 1 is",
> > >             " no. successes and col 2 is no. failures")
> > >     })
> > >     structure(list(family="binomial", link=logexp(ExposureDays),
> > >        linkfun=logexp(ExposureDays)$linkfun,
> > >        linkinv=logexp(ExposureDays)$linkinv, variance=variance,
> > >        dev.resids=dev.resids, aic=aic,
> > >        mu.eta=logexp(ExposureDays)$mu.eta, initialize=initialize,
> > >        validmu=validmu, valideta=logexp$valideta), class = "family")
> > > }
> > >
> > >
> > >
> > > Now, here's how it works in a GLM:
> > >
> > > > apfa.glm.1<-glm(Success~MeanAge+I(MeanAge^2), family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > > > summary(apfa.glm.1)
> > >
> > > Call:
> > > glm(formula = Success ~ MeanAge + I(MeanAge^2), family =
> > > logexposure(link = "logexp",
> > >     ExposureDays = apfa4$Days), data = apfa4)
> > >
> > > Deviance Residuals:
> > >     Min       1Q   Median       3Q      Max
> > > -3.1525   0.2802   0.3637   0.4291   0.7599
> > >
> > > Coefficients:
> > >                Estimate Std. Error z value Pr(>|z|)
> > > (Intercept)   5.5594830  0.6085542   9.136   <2e-16 ***
> > > MeanAge      -0.0908251  0.0407218  -2.230   0.0257 *
> > > I(MeanAge^2)  0.0014926  0.0006104   2.445   0.0145 *
> > > ---
> > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> > >
> > > (Dispersion parameter for binomial family taken to be 1)
> > >
> > >     Null deviance: 323.58  on 661  degrees of freedom
> > > Residual deviance: 285.65  on 659  degrees of freedom
> > > AIC: 291.65
> > >
> > > Number of Fisher Scoring iterations: 6
> > >
> > >
> > >
> > > Next, here's the results of a glmmPQL run:
> > >
> > > > apfa.glmm.1<-glmmPQL(Success~MeanAge+I(MeanAge^2), random=~1|Territory, family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > > iteration 1
> > > > summary(apfa.glmm.1)
> > > Linear mixed-effects model fit by maximum likelihood
> > >  Data: apfa4
> > >   AIC BIC logLik
> > >    NA  NA     NA
> > >
> > > Random effects:
> > >  Formula: ~1 | Territory
> > >          (Intercept) Residual
> > > StdDev: 0.0003431913 1.051947
> > >
> > > Variance function:
> > >  Structure: fixed weights
> > >  Formula: ~invwt
> > > Fixed effects: Success ~ MeanAge + I(MeanAge^2)
> > >                  Value Std.Error  DF   t-value p-value
> > > (Intercept)   5.559466 0.6416221 624  8.664705  0.0000
> > > MeanAge      -0.090824 0.0429346 624 -2.115397  0.0348
> > > I(MeanAge^2)  0.001493 0.0006436 624  2.319090  0.0207
> > >  Correlation:
> > >              (Intr) MeanAg
> > > MeanAge      -0.927
> > > I(MeanAge^2)  0.826 -0.968
> > >
> > > Standardized Within-Group Residuals:
> > >         Min          Q1         Med          Q3         Max
> > > -11.3646020   0.1901969   0.2485473   0.2951632   0.5499915
> > >
> > > Number of Observations: 662
> > > Number of Groups: 36
> > >
> > >
> > >
> > > Finally, here's what happens when I try to run an LMER model (same
> > > error messages no matter which estimation method I choose):
> > >
> > > > apfa.lmer.1<-lmer(Success~MeanAge+I(MeanAge^2)+(1|Territory), data=apfa4, family=logexposure(link="logexp", ExposureDays=apfa4$Days), method="Laplace")
> > > > summary(apfa.lmer.1)
> > > Error in if (any(sd < 0)) return("'sd' slot has negative entries") :
> > >         missing value where TRUE/FALSE needed
> > > > names(apfa.lmer.1)
> > > NULL
> >
> > > So, does anyone have any idea as to whether the problem is in the
> > > user-defined link function as written, or have any thoughts about how
> > > to get around this problem? If LMER and LME can't do it, could there
> > > be some way to trick the glmmML function into accepting the
> > > user-defined link function?
> >
> > lmer is designed to work with arbitrary families but I haven't done a
> > lot of testing outside the binomial and poisson families.
> >
> > Try looking at the structure of an instance of your family and
> > comparing it to, say,
> >
> > str(binomial())
> >
> > Make sure that all the components are there and have the correct form.
> >
> > The next step is to use verbose output so you can see the progress of
> > the iterations.  Add the optional argument control = list(msVerbose =
> > 1) to your call to lmer.
> >
> > Instead of immediately requesting a summary, use str(apfa.lmer.1) to
> > check the structure.  Again you may want to compare this description
> > to that from str applied to a similar fit using the binomial family.
> >
>



More information about the R-help mailing list