[R] glmmPQL questions

Mon Mar 28 13:06:25 CEST 2005

I am looking at risk factors for disease in cattle and am interested in
modelling farm and sampling cluster as random effects (my outcome is positive
or negative at the level of the farm). I am using R version 2.0.1 on a Mac and
have identified glmmPQL as, hopefully, the correct function to use. I have run
a couple of models with it and was hoping that you might be able to answer a
few questions.

e.g. model <- glmmPQL(farmstatus ~ cattlenumber, random = ~ 1 | farm, family = binomial)

I am pretty new to both R and stats so if these questions are very simple and I
am just missing something, suggestions about good texts on GLMM in R would be
great.

First, what is the best way to restrict the model to certain levels of a
multi-level factor? For example, I have a categorisation of cattle number in
which all points of high influence (as determined using
summary(influence.measures(model))) are confined to the largest class (D), and
I want to run a model that looks only at levels A, B and C (or only at the
months May-September).

Second, what is the best way to force specified variables to remain in the
model when running, e.g., stepwise selection of interaction terms?

Finally, is there a recognised method for dealing with missing values in
these models?

As a minor point, the models do not run unless I specify the data= argument.
Since this is apparently optional, I was wondering why it is required when all
of my variables are in the same data frame (and even when that data frame is
attached).

Any help would be greatly appreciated.

Jo Halliday
MSc student
University of Edinburgh
s0454869 at sms.ed.ac.uk