[R] glmm in R

Diego Rubolini diego.rubolini at unipv.it
Thu Aug 26 16:50:34 CEST 2004


Dear all,

I'm new to R and to the list, and I have a problem which I'm unable to solve.

Consider the following simple simulated data frame:

simul<-data.frame(Group=factor(c(rep(1,4),rep(2,2),rep(3,6),rep(4,8),rep(5,4))),Pair=factor(rep(1:12,rep(2,12))),Y=rep(c(0,1),12),X1=c(9,10,8,9,1,5,2,7,0,3,7,9,5,4,6,8,4,8,10,4,1,6,2,3),X2=c(1,3,2,5,2,4,0,1,3,7,8,10,3,4,5,5,4,8,10,12,0,3,3,4),Day=(c(1,1,2,2,2,2,1,1,2,2,3,3,2,2,4,4,5,5,7,7,1,1,2,2)))

I'd like to fit a logistic mixed model with two random effects, one is a 
subject level random effect (Group) and another is a pair-level random 
effect (Pair), which is nested within Group. Furthermore, I have a Day 
variable (a time variable), which has the same value for each Y pair. As 
far as I could understand, the syntax for the nested random effect should 
be as follows (but I'm not sure about this):

random = ~ 1|Pair/Group

I used GLMM (from lme4) and glmPQL (from MASS) functions using the 
following model specifications:

fit <- GLMM(Y ~ X1+X2+Day+X1*Day+X2*Day, family = binomial, data = 
simul,  random = ~ 1|Pair/Group)
summary(fit)

fit1 <- glmmPQL(Y ~ X1+X2+Day+X1*Day+X2*Day, family = binomial, data = 
simul,  random = ~ 1|Pair/Group)
summary(fit1)

My questions are (in order of importance):

1) is the formulation correct for the nested random effect?

2) why do the results of the two packages give different std. errors and 
P-values for parameter estimates, which are indeed very similar? glmmPQL 
tend to give greater errors and higher P's than GLMM. Further, it takes 
longer to produce the output. They both implement PQL estimation, I 
believe. Which function is best? I believe it should be that with smaller se's.

3) I wish also to model Day as a random intercept in Group (but not in Pair 
nested in Group), but I do not know if this is possible. Does anybody have 
any suggestion?

4) X2 and Day are correlated by nature. Is this giving any problem for 
parameter estimates?

I have already bothered other statisticians about this, but haven't still 
resolved my case (you know who you are :-)). I've also been told that PQL 
estimates are flawed, and that I should better turn to other software (e.g. 
SAS PROC NLMIXED).

Thank you all for your attention,

Diego


_________________________________________________

  Diego Rubolini
  Dipartimento di Biologia Animale
  Università degli Studi di Pavia
  Piazza Botta 9
  I - 27100 Pavia
  Italy
  Email: diego.rubolini at unipv.it

  URL: http://www.unipv.it/webbio/labweb/ecoeto/




More information about the R-help mailing list