[R] lmer binomial model overestimating data?

Sat Jun 17 19:09:04 CEST 2006

<see inline>	

Martin Henry H. Stevens wrote:
> Hi folks,
> Warning: I don't know if the result I am getting makes sense, so this  
> may be a statistics question.
> 
> The fitted values from my binomial lmer mixed model seem to  
> consistently overestimate the cell means, and I don't know why. I  
> assume I am doing something stupid.

	  I copied your "Raw.Data" and "Fitted.Estimates" into two columns of a 
data.frame OE and then did a t test as follows:
 > with(OE, t.test(Raw.Data-Fitted.Estimates))

	One Sample t-test

data:  Raw.Data - Fitted.Estimates
t = -2.1662, df = 11, p-value = 0.05313
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
  -0.0991997752  0.0007897752
sample estimates:
mean of x
-0.049205

	  I also tried the same thing with a modification of an example in the 
"lmer" help file.  In that case, lmer seemed to consistently 
UNDERestimate the cell means.  However, the difference again was not 
statistically significant.

	  I suggest you consider this a disguised Rorschach ink blot test and 
worry about other things.

	  Hope this helps.
	  Spencer Graves	

> 
> Below I include code, and a binary image of the data is available at  
> this link:
> http://www.cas.muohio.edu/~stevenmh/tf.RdataBin
> 
> This was done with `Matrix' version 0.995-10 and `lme4' version  
> 0.995-2. and R v. 2.3.1 on a Mac, OS 10.4.6.
> 
> The binomial model below ("mod") was reduced from a more complex one  
> by first using AIC, BIC and LRT for "random" effects, and then  
> relying on Helmert contrasts and AIC, BIC, and LRT to simplify fixed  
> effects. Maybe this was wrong?
> 
>  > load("tf.RdataBin")
>  > library(lme4)
> 
>  > options(contrasts=c("contr.helmert", "contr.poly"))
>  > mod <- lmer(tfb ~ reg+nutrient+amd +reg:nutrient+
> +     (1|rack) + (1|popu) +  (1|gen), data=dat.tb2, family=binomial,  
> method="Laplace")
>  > summary(mod)
> Generalized linear mixed model fit using Laplace
> Formula: tfb ~ reg + nutrient + amd + reg:nutrient + (1 | rack) + (1  
> |      popu) + (1 | gen)
> 	  Data: dat.tb2
> Family: binomial(logit link)
>      AIC    BIC  logLik deviance
> 402.53 446.64 -191.26   382.53
> Random effects:
> Groups Name        Variance Std.Dev.
> gen    (Intercept) 0.385    0.621
> popu   (Intercept) 0.548    0.741
> rack   (Intercept) 0.401    0.633
> number of obs: 609, groups: gen, 24; popu, 9; rack, 2
> 
> Estimated scale (compare to 1)  0.80656
> 
> Fixed effects:
>                 Estimate Std. Error z value Pr(>|z|)
> (Intercept)       2.391      0.574    4.17  3.1e-05
> reg1              0.842      0.452    1.86  0.06252
> reg2              0.800      0.241    3.32  0.00091
> nutrient1         0.788      0.197    4.00  6.3e-05
> amd1             -0.540      0.139   -3.88  0.00010
> reg1:nutrient1    0.500      0.227    2.21  0.02734
> reg2:nutrient1   -0.176      0.146   -1.21  0.22794
> 
> Correlation of Fixed Effects:
>              (Intr) reg1   reg2   ntrnt1 amd1   rg1:n1
> reg1         0.169
> reg2        -0.066 -0.191
> nutrient1    0.178  0.231 -0.034
> amd1        -0.074 -0.044 -0.052 -0.078
> reg1:ntrnt1  0.157  0.307 -0.180  0.562 -0.002
> reg2:ntrnt1 -0.028 -0.154  0.236  0.141  0.033 -0.378
>  > X <- mod @ X
>  > fitted <- X %*% fixef(mod)
>  >
>  > unlogitH <- function(x) {( 1 + exp(-x) )^-1}
>  > (result <- data.frame(Raw.Data=with(dat.tb2,
> +                          tapply(tfb, list(reg:nutrient:amd),
> +                          mean ) ),
> +         Fitted.Estimates=with(dat.tb2,
> +                          tapply(fitted, list(reg:nutrient:amd),
> +                          function(x) unlogitH(mean(x))  ) )  ))
>                 Raw.Data Fitted.Estimates
> SW:1:unclipped  0.50877          0.69520
> SW:1:clipped    0.41304          0.43669
> SW:8:unclipped  0.67273          0.85231
> SW:8:clipped    0.52830          0.66233
> NL:1:unclipped  0.88889          0.81887
> NL:1:clipped    0.53571          0.60578
> NL:8:unclipped  0.96552          0.98830
> NL:8:clipped    0.96154          0.96635
> SP:1:unclipped  0.98649          0.98361
> SP:1:clipped    0.92537          0.95328
> SP:8:unclipped  1.00000          0.99308
> SP:8:clipped    0.95890          0.97992
>  > ### Perhaps the cell SP:8:clipped = 1.0 is messing up the fit?
>  > pdf("RawAndFitted.pdf")
>  > par(mar=c(8,3,2,2), las=2)
>  > barplot(t(result), beside=TRUE )
>  > box(); title("Fractions of Plants Producing Fruits")
>  > legend("topleft", c("Raw Data", "Fitted Values"),
> +        fill=gray.colors(2), bty="n" )
>  > dev.off()
> quartz
>       2
>  >
> 
>                 _
> platform       powerpc-apple-darwin8.6.0
> arch           powerpc
> os             darwin8.6.0
> system         powerpc, darwin8.6.0
> status
> major          2
> minor          3.1
> year           2006
> month          06
> day            01
> svn rev        38247
> language       R
> version.string Version 2.3.1 (2006-06-01)
>  >
> 
> Dr. M. Hank H. Stevens, Assistant Professor
> 338 Pearson Hall
> Botany Department
> Miami University
> Oxford, OH 45056
> 
> Office: (513) 529-4206
> Lab: (513) 529-4262
> FAX: (513) 529-4243
> http://www.cas.muohio.edu/~stevenmh/
> http://www.muohio.edu/ecology/
> http://www.muohio.edu/botany/
> "E Pluribus Unum"
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html