[R] GLMM(..., family=binomial(link="cloglog"))?

Göran Broström gb at stat.umu.se
Wed Jun 9 15:38:30 CEST 2004


On Tue, Jun 08, 2004 at 08:32:24AM -0700, Spencer Graves wrote:
> Hi, Doug: 
> 
>      Thanks.  I'll try the things you suggest.  The observed 
> proportions ranged from roughly 0.2 to 0.8 in 100 binomial random 
> samples where sigma is at most 0.05.  Jim Lindsey's "glmm" does 
> Gauss-Hermite quadrature, but I don't know if it bothers with the 
> adaptive step.  With it, I've seen estimates of the variance component 
> ranging from 0.4 to 0.7 or so.  Since I simulated a normal with mean 0 
> and standard deviation 1, the algorithm was clearly underestimating what 
> was simulated.  My next step, I think, is to program adaptive Gauss-Hermite 
> quadrature for something closer to my real problem (as you just 
> suggested), and see what I get. 
[...]

> Douglas Bates wrote:
> 
> >Spencer Graves <spencer.graves at pdf.com> writes:
> >
[...]
> >>	  Consider the following:
> >>
> >>> set.seed(1); N <- 10
> >>> z <- rnorm(N)
> >>> pz <- inv.logit(z)
> >>> DF <- data.frame(z=z, pz=pz, y=rbinom(N, 100, pz)/100, n=100,
> >>> smpl=factor(1:N))
> >>
> >>> GLMM(y~1, family=binomial, data=DF, random=~1|smpl, weights=n)

I think the "weights=n" is the problem, i.e., you summarize Bernoulli
data to Bin(100, p) data, and that gives a completely different estimate of
the variance of the random effect. (This looks as an error in lme4 to me,
or am I missing something? Doug?) Really, the two ways of representing data
should give equivalent analyses, but it doesn't. The same phenomenon
appears in 'glm', i.e. without random effects, but only for the residual
sum of squares, df, and AIC. 
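
For reference, here is a minimal glm-only sketch of that last point (not
part of the simulation below; the object names are just illustrative):
fitting the same data once as a proportion with 'weights' and once as 0/1
Bernoulli rows gives identical coefficients and standard errors, but
different residual deviance, df, and AIC.

-------------------------------------------------------------------
set.seed(1)
k <- rbinom(1, 100, 0.6)                  # successes out of 100 trials

## Aggregated representation: one row with a proportion and weights
agg <- data.frame(y = k / 100, n = 100)
fit.agg <- glm(y ~ 1, family = binomial, data = agg, weights = n)

## Bernoulli representation: 100 rows of 0/1
bern <- data.frame(y = rep(c(1, 0), c(k, 100 - k)))
fit.bern <- glm(y ~ 1, family = binomial, data = bern)

summary(fit.agg)$coefficients             # same estimate and std. error
summary(fit.bern)$coefficients
AIC(fit.agg); AIC(fit.bern)               # these differ
-------------------------------------------------------------------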

The following is a small function that calls GLMM from lme4 twice, first
as before, and then with the data expanded into individual Bernoulli
observations. I also run 'glmmML' on the latter data (since glmmML only
allows the Bernoulli representation):

-------------------------------------------------------------------   
sim <- function(N = 10, grpSize = 10, std = 1){
    require(glmmML)
    require(lme4)
    
    ## Simulate N group-level random effects and binomial counts:
    set.seed(1)
    z <- rnorm(N, mean = 0, sd = std)
    ## pz <- inv.logit(z) is equivalent to plogis(z) (in 'stats'):
    pz <- plogis(z)
    Y <- rbinom(N, grpSize, pz)

    ## 'Summary' data frame:
    
    DF <- data.frame(z = z, pz = pz,
                     y = Y / grpSize,
                     n = grpSize,
                     smpl = factor(1:N))

    fit1 <- GLMM(y~1, family = binomial, data = DF,
                 random = ~1|smpl, weights = n)
    ##fit1 <- glm(y~1, family = binomial, data = DF, weights = n)


    ## 'Individual' data frame:

    n <- N * grpSize
    z <- rep(z, rep(grpSize, N))
    pz <- rep(pz, rep(grpSize, N))

    ## The first Y[i] observations in group i are successes (1), the rest 0:
    y <- numeric(n)
    for (i in 1:N)
        if (Y[i] > 0)
            y[((i - 1) * grpSize + 1):((i - 1) * grpSize + Y[i])] <- 1
    smpl <- as.factor(rep(1:N, rep(grpSize, N)))
    
    bigDF <- data.frame(z = z, pz = pz, y = y,
                        smpl = smpl)
    fit2 <- GLMM(y~1, family=binomial, data=bigDF, random=~1|smpl)
    ##fit2 <- glm(y~1, family=binomial, data=bigDF)

    fitML <- glmmML(y ~ 1, family = binomial,
                    data = bigDF,
                    cluster = bigDF$smpl)
    
    return(list(fit1 = fit1,
                fit2 = fit2,
                fitML = fitML
                )
           )
}
----------------------------------------------------------------
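
A call with the default arguments, along the lines of

    res <- sim()   # N = 10, grpSize = 10, std = 1
    res

presumably produced the output below (the exact call is not shown in the
original post).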

Output:

$fit1
Generalized Linear Mixed Model

Family: binomial family with logit link
Fixed: y ~ 1 
Data: DF 
 log-likelihood:  -11.59919 
Random effects:
 Groups Name        Variance   Std.Dev.  
 smpl   (Intercept) 3.0384e-10 1.7431e-05

Estimated scale (compare to 1)  1.391217 

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.48955    0.20602  2.3762  0.01749

Number of Observations: 10
Number of Groups: 10 

$fit2
Generalized Linear Mixed Model

Family: binomial family with logit link
Fixed: y ~ 1 
Data: bigDF 
 log-likelihood:  -64.8084 
Random effects:
 Groups Name        Variance Std.Dev.
 smpl   (Intercept) 0.58353  0.7639  

Estimated scale (compare to 1)  0.9563351 

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.52811    0.32286  1.6357   0.1019

Number of Observations: 100
Number of Groups: 10 

$fitML

Call:  glmmML(formula = y ~ 1, data = bigDF, 
              cluster = bigDF$smpl, family = binomial) 


              coef se(coef)     z Pr(>|z|)
(Intercept) 0.5587   0.3313 1.686   0.0917

Standard deviation in mixing distribution:  0.764 
Std. Error:                                 0.3655 

Residual deviance: 129.5 on 98 degrees of freedom       AIC: 133.5 
---------------
Note the big difference in the estimated random-effect 'sd' between the two
lme4 runs! Note further how close the corresponding estimates from the
second lme4 run and from glmmML are to each other.

[...]

Göran
-- 
 Göran Broström                    tel: +46 90 786 5223
 Department of Statistics          fax: +46 90 786 6614
 Umeå University                   http://www.stat.umu.se/egna/gb/
 SE-90187 Umeå, Sweden             e-mail: gb at stat.umu.se



