[R] R vs Stata on generalized linear mixed models: glmer and xtmelogit

Tue Aug 19 14:53:45 CEST 2008

On Mon, Aug 18, 2008 at 7:55 PM,  <Antonio.Gasparrini at lshtm.ac.uk> wrote:
> Hello,
> I have compared the potentials of R and Stata about GLMM, analysing the dataset 'ohio' in the package 'faraway' (the same dataset is analysed with GEE in the book 'Extending the linear model with R' by Julian Faraway).
> Basically, I've tried the 2 commands 'glmmPQL' and 'glmer' of R and the command 'xtmelogit' of Stata. If I'm not wrong, 'glmer' uses the Laplacian approximation as default, corresponding to adaptive Gauss-Hermite approximation with only 1 point, while 'xtmelogit' uses 7 points. In order to compare them, I tried also to change the corresponding parameters.
>
> This is the code for R:
>
> rm(list=ls())
> library(faraway); library(lme4); library(MASS)
> data <- ohio
> pql  <- glmmPQL(resp~smoke+factor(age), random=~1|id, family=binomial,data)
> summary(pql)$tTable["smoke",1:2]
> lap <- glmer(resp~smoke+factor(age)+(1|id), family=binomial,data)
> attributes(summary(lap))$coefs["smoke",1:2]
> agq7 <- glmer(resp~smoke+factor(age)+(1|id),nAGQ=7,family=binomial,data)
> write.csv(data,file="data.csv")
>
> This is the code for Stata:
>
> clear
> insheet using data.csv
> xi: xtmelogit resp smoke i.age, || id:, covariance(independent) laplace
> xi: xtmelogit resp smoke i.age, || id:, covariance(independent)
>
> Results:
> - Both the point estimate and the standard error for the fixed effect, and the standard deviation for random effect of 'glmmPQL' are lower than 'glmer'
> - 'glmer' and 'xtmelogit' with Laplacian approximation give very similar results. 'xtmelogit' with 7 points gives similar point estimates for fixed effects, but a different (lower) estimate for the standard deviation of the random effect (as expected)
> - glmer doesn't work with the parameter 'nAGO' (number of points) set to 7, returning 'Code not yet written'
>
> My questions:
> 1) Is the difference between 'glmmPQL' and 'glmer' expected? Which is more reliable?

Yes, the difference is to be expected.  The results from glmer should
better approximate the maximum likelihood estimates.  The Laplace
approximation is a direct approximation to the log-likelihood of the
model at the given parameter estimates.  The PQL method is an indirect
method of determining parameter estimates.  Also, glmmPQL is based on
the lme function from the nlme package and the model representation
and optimization methods in the lme4 package are considerably more
advanced than those in the nlme package.

> 2) Is there a way to set the parameter 'nAGO' to 7 or perform the same analysis in another way?

You have to wait until Bin Dai incorporates the AGQ code from his
Google Summer of Code (SoC) project into the release version of the
lme4 package.  Officially the SoC projects ended yesterday so that
should occur soon.

> 3) Is there a tutorial on the use of lme4, especially to handle the results (summary, coef, etc.)?

The documentation is still quite scattered.  I am better at changing
code than I am at updating the documentation to reflect the current
state of the code.

I think the best current documentation by me on the use of lme4 and
some of the background on mixed models are the slides for my
presentations in a recent workshop at the University of Potsdam.  They
are available at http://www.stat.wisc.edu/~bates/PotsdamGLMM/ as files
LMMD.pdf and GLMMD.pdf    The Sweave sources are also available there
as are the scripts extracted from the Sweave sources (in the scripts
directory).

>
> Thank you for your time
>
> Antonio Gasparrini
> Public and Environmental Health Research Unit (PEHRU)
> London School of Hygiene & Tropical Medicine
> Keppel Street, London WC1E 7HT, UK
> Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
> http://www.lshtm.ac.uk/people/gasparrini.antonio ( http://www.lshtm.ac.uk/pehru/ )
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>