[R] error using user-defined link function with mixed models (LMER)

Jessi Brown jessilbrown at gmail.com
Sat Feb 10 22:27:16 CET 2007


Ok, I've tried checking out the structure of the binomial and
logexposure families, the big difference  appears to be the valideta
parameter (it's "NULL" in the logexposure family).

First, binomial:
> str(binomial())
List of 11
 $ family    : chr "binomial"
 $ link      : chr "logit"
 $ linkfun   :function (mu)
 $ linkinv   :function (eta)
 $ variance  :function (mu)
 $ dev.resids:function (y, mu, wt)
 $ aic       :function (y, n, mu, wt, dev)
 $ mu.eta    :function (eta)
 $ initialize:  expression({     if (NCOL(y) == 1) {         if
(is.factor(y))              y <- y != levels(y)[1]         n <-
rep.int(1, nobs)         if (any(y < 0 | y > 1))              stop("y
values must be 0 <= y <= 1")         mustart <- (weights * y +
0.5)/(weights + 1)         m <- weights * y         if (any(abs(m -
round(m)) > 0.001))              warning("non-integer #successes in a
binomial glm!")     }     else if (NCOL(y) == 2) {         if
(any(abs(y - round(y)) > 0.001))              warning("non-integer
counts in a binomial glm!")         n <- y[, 1] + y[, 2]         y <-
ifelse(n == 0, 0, y[, 1]/n)         weights <- weights * n
mustart <- (n * y + 0.5)/(n + 1)     }     else stop("for the binomial
family, y must be a vector of 0 and 1's\n",          "or a 2 column
matrix where col 1 is no. successes and col 2 is no. failures") })
 $ validmu   :function (mu)
 $ valideta  :function (eta)
 - attr(*, "class")= chr "family"


Now, logexposure:
> str(logexposure(link="logexp", ExposureDays=apfa4$days))
List of 11
 $ family    : chr "binomial"
 $ link      :List of 5
  ..$ linkfun :function (mu)
  .. ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
  ..$ linkinv :function (eta)
  .. ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
  ..$ mu.eta  :function (eta)
  .. ..- attr(*, "source")= chr [1:3] "function(eta)" ...
  ..$ valideta:function (eta)
  .. ..- attr(*, "source")= chr "function(eta) TRUE"
  ..$ name    : chr "logexp()"
  ..- attr(*, "class")= chr "link-glm"
 $ linkfun   :function (mu)
  ..- attr(*, "source")= chr "function(mu) qlogis(mu^(1/days))"
 $ linkinv   :function (eta)
  ..- attr(*, "source")= chr "function(eta) plogis(eta)^days"
 $ variance  :function (mu)
  ..- attr(*, "source")= chr "function(mu) mu * (1 - mu)"
 $ dev.resids:function (y, mu, wt)
  ..- attr(*, "source")= chr [1:2] "function(y, mu, wt)
.Call(\"binomial_dev_resids\"," ...
 $ aic       :function (y, n, mu, wt, dev)
  ..- attr(*, "source")= chr [1:7] "function(y, n, mu, wt, dev) {" ...
 $ mu.eta    :function (eta)
  ..- attr(*, "source")= chr [1:3] "function(eta)" ...
 $ initialize:  expression({     if (NCOL(y) == 1) {         if
(is.factor(y))              y <- y != levels(y)[1]         n <-
rep.int(1, nobs)         if (any(y < 0 | y > 1))              stop("y
values must be 0 <= y <= 1")         mustart <- (weights * y +
0.5)/(weights + 1)         m <- weights * y         if (any(abs(m -
round(m)) > 0.001))              warning("non-integer successes in a
binomial glm!")     }     else if (NCOL(y) == 2) {         if
(any(abs(y - round(y)) > 0.001))              warning("non-integer
counts in a binomial glm!")         n <- y[, 1] + y[, 2]         y <-
ifelse(n == 0, 0, y[, 1]/n)         weights <- weights * n
mustart <- (n * y + 0.5)/(n + 1)     }     else stop("for the binomial
family,", " y must be a vector of 0 and 1's\n",          "or a 2
column matrix where col 1 is", " no. successes and col 2 is no.
failures") })
 $ validmu   :function (mu)
  ..- attr(*, "source")= chr "function(mu) all(mu > 0) && all(mu < 1)"
 $ valideta  : NULL
 - attr(*, "class")= chr "family"


So, could this be the root of the problem?

Here again is the logexp function:
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    mu.eta <- function(eta)
      days*.Call("logit_mu_eta", eta,
       PACKAGE = "stats")*plogis(eta)^(days-1)
    valideta <- function(eta) TRUE
    link <- paste("logexp(", days, ")", sep="")
    structure(list(linkfun = linkfun, linkinv = linkinv,
       mu.eta = mu.eta, valideta = valideta, name = link),
        class = "link-glm")
}

So, does something seem obviously wrong about the "valideta <-
function(eta) TRUE" bit? I readily admit that I did not write these
user-defined link and family functions (I believe they resulted from
the combined efforts of Mark Herzog and Brian Ripley), so my clumsy
efforts to toy with the valideta portion of the logexp link function
aren't working.

Thanks again in advance for any further advice.

cheers, Jessi Brown

On 2/10/07, Douglas Bates <bates at stat.wisc.edu> wrote:
> On 2/10/07, Jessi Brown <jessilbrown at gmail.com> wrote:
> > Greetings, everyone. I've been trying to analyze bird nest survival
> > data using generalized linear mixed models (because we documented
> > several consecutive nesting attempts by the same individuals; i.e.
> > repeated measures data) and have been unable to persuade the various
> > GLMM models to work with my user-defined link function. Actually,
> > glmmPQL seems to work, but as I want to evaluate a suite of competing
> > models, I'd like to use ML or REML estimation methods in order to end
> > up with meaningful log-likelihoods.
> >
> > Here's the link function I use:
> > logexp <- function(days = 1)
> > {
> >     linkfun <- function(mu) qlogis(mu^(1/days))
> >     linkinv <- function(eta) plogis(eta)^days
> >     mu.eta <- function(eta)
> >       days*.Call("logit_mu_eta", eta,
> >        PACKAGE = "stats")*plogis(eta)^(days-1)
> >     valideta <- function(eta) TRUE
> >     link <- paste("logexp(", days, ")", sep="")
> >     structure(list(linkfun = linkfun, linkinv = linkinv,
> >        mu.eta = mu.eta, valideta = valideta, name = link),
> >         class = "link-glm")
> > }
> >
> > # Modified binomial family function (that allows logexp link function)
> > logexposure<-function (link="logexp",ExposureDays) {
> >     variance <- function(mu) mu * (1 - mu)
> >     validmu <- function(mu) all(mu > 0) && all(mu < 1)
> >     dev.resids <- function(y, mu, wt) .Call("binomial_dev_resids",
> >         y, mu, wt, PACKAGE = "stats")
> >     aic <- function(y, n, mu, wt, dev) {
> >         m <- if (any(n > 1))
> >             n
> >         else wt
> >         -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m *
> >             y), round(m), mu, log = TRUE))
> >     }
> >     initialize <- expression({
> >         if (NCOL(y) == 1) {
> >             if (is.factor(y)) y <- y != levels(y)[1]
> >             n <- rep.int(1, nobs)
> >             if (any(y < 0 | y > 1)) stop("y values must be 0 <= y <= 1")
> >             mustart <- (weights * y + 0.5)/(weights + 1)
> >             m <- weights * y
> >             if (any(abs(m - round(m)) > 0.001))
> >                  warning("non-integer successes in a binomial glm!")
> >         } else if (NCOL(y) == 2) {
> >             if (any(abs(y - round(y)) > 0.001))
> >                 warning("non-integer counts in a binomial glm!")
> >             n <- y[, 1] + y[, 2]
> >             y <- ifelse(n == 0, 0, y[, 1]/n)
> >             weights <- weights * n
> >             mustart <- (n * y + 0.5)/(n + 1)
> >         } else stop("for the binomial family,",
> >                 " y must be a vector of 0 and 1's\n",
> >             "or a 2 column matrix where col 1 is",
> >             " no. successes and col 2 is no. failures")
> >     })
> >     structure(list(family="binomial", link=logexp(ExposureDays),
> >        linkfun=logexp(ExposureDays)$linkfun,
> >        linkinv=logexp(ExposureDays)$linkinv, variance=variance,
> >        dev.resids=dev.resids, aic=aic,
> >        mu.eta=logexp(ExposureDays)$mu.eta, initialize=initialize,
> >        validmu=validmu, valideta=logexp$valideta), class = "family")
> > }
> >
> >
> >
> > Now, here's how it works in a GLM:
> >
> > > apfa.glm.1<-glm(Success~MeanAge+I(MeanAge^2), family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > > summary(apfa.glm.1)
> >
> > Call:
> > glm(formula = Success ~ MeanAge + I(MeanAge^2), family =
> > logexposure(link = "logexp",
> >     ExposureDays = apfa4$Days), data = apfa4)
> >
> > Deviance Residuals:
> >     Min       1Q   Median       3Q      Max
> > -3.1525   0.2802   0.3637   0.4291   0.7599
> >
> > Coefficients:
> >                Estimate Std. Error z value Pr(>|z|)
> > (Intercept)   5.5594830  0.6085542   9.136   <2e-16 ***
> > MeanAge      -0.0908251  0.0407218  -2.230   0.0257 *
> > I(MeanAge^2)  0.0014926  0.0006104   2.445   0.0145 *
> > ---
> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > (Dispersion parameter for binomial family taken to be 1)
> >
> >     Null deviance: 323.58  on 661  degrees of freedom
> > Residual deviance: 285.65  on 659  degrees of freedom
> > AIC: 291.65
> >
> > Number of Fisher Scoring iterations: 6
> >
> >
> >
> > Next, here's the results of a glmmPQL run:
> >
> > > apfa.glmm.1<-glmmPQL(Success~MeanAge+I(MeanAge^2), random=~1|Territory, family=logexposure(link="logexp", ExposureDays=apfa4$Days), data=apfa4)
> > iteration 1
> > > summary(apfa.glmm.1)
> > Linear mixed-effects model fit by maximum likelihood
> >  Data: apfa4
> >   AIC BIC logLik
> >    NA  NA     NA
> >
> > Random effects:
> >  Formula: ~1 | Territory
> >          (Intercept) Residual
> > StdDev: 0.0003431913 1.051947
> >
> > Variance function:
> >  Structure: fixed weights
> >  Formula: ~invwt
> > Fixed effects: Success ~ MeanAge + I(MeanAge^2)
> >                  Value Std.Error  DF   t-value p-value
> > (Intercept)   5.559466 0.6416221 624  8.664705  0.0000
> > MeanAge      -0.090824 0.0429346 624 -2.115397  0.0348
> > I(MeanAge^2)  0.001493 0.0006436 624  2.319090  0.0207
> >  Correlation:
> >              (Intr) MeanAg
> > MeanAge      -0.927
> > I(MeanAge^2)  0.826 -0.968
> >
> > Standardized Within-Group Residuals:
> >         Min          Q1         Med          Q3         Max
> > -11.3646020   0.1901969   0.2485473   0.2951632   0.5499915
> >
> > Number of Observations: 662
> > Number of Groups: 36
> >
> >
> >
> > Finally, here's what happens when I try to run an LMER model (same
> > error messages no matter which estimation method I choose):
> >
> > > apfa.lmer.1<-lmer(Success~MeanAge+I(MeanAge^2)+(1|Territory), data=apfa4, family=logexposure(link="logexp", ExposureDays=apfa4$Days), method="Laplace")
> > > summary(apfa.lmer.1)
> > Error in if (any(sd < 0)) return("'sd' slot has negative entries") :
> >         missing value where TRUE/FALSE needed
> > > names(apfa.lmer.1)
> > NULL
>
> > So, does anyone have any idea as to whether the problem is in the
> > user-defined link function as written, or have any thoughts about how
> > to get around this problem? If LMER and LME can't do it, could there
> > be some way to trick the glmmML function into accepting the
> > user-defined link function?
>
> lmer is designed to work with arbitrary families but I haven't done a
> lot of testing outside the binomial and poisson families.
>
> Try looking at the structure of an instance of your family and
> comparing it to, say,
>
> str(binomial())
>
> Make sure that all the components are there and have the correct form.
>
> The next step is to use verbose output so you can see the progress of
> the iterations.  Add the optional argument control = list(msVerbose =
> 1) to your call to lmer.
>
> Instead of immediately requesting a summary, use str(apfa.lmer.1) to
> check the structure.  Again you may want to compare this description
> to that from str applied to a similar fit using the binomial family.
>



More information about the R-help mailing list