[R] How to model repeated measures negative binomial data with GEE or GLMM
B Hansen
bethanykaye4 at gmail.com
Mon Feb 26 17:28:37 CET 2018
Goal: use GEE or GLMM to analyze repeated measures data in R
GEE problem: can’t find a way to do GEE with negative binomial family in R
GLMM problem: not sure if I’m specifying random effect correctly
Study question: Does the interaction of director and recipient group affect
rates of a behavior?
Data:
Animals (n = 38), each in one of 3 groups (life stages): A, B, or C.
Some individuals (~5) transitioned between groups between observation
periods (2010, 2011, 2012), e.g., transitioning from B -> C.
I gathered data on individuals in groups B and C, recording how often they
directed a behavior to individuals in groups A, B, or C.
I have multiple measures for each director (both within and between years).
For example, for an individual who was alive for the entire study, I have
one count of the behavior directed toward groups A, B, and C for each of
the three years of study (total = 9 counts). Some individuals were observed
all three years, others were only observed one or two years.
Approach 1: GEE
I initially used GEE in SPSS, the software I first learned on (but no
longer have access to).
Outcome variable: counts of directed behaviors (“Diract”) from directors
(“Dir” in group B or C) to members of groups A, B, C (“Rec”). Values range
from 0-4, with overdispersion.
Offset: the amount of time the focal individual was observed with
individuals from the recipient group (natural log transformation applied,
“LnScan”)
Explanatory variable: the interaction of director and recipient group
(“Dir*Rec”)
Fixed effect: “Year”
Family: Negative binomial with log link function
Exchangeable working correlation matrix
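The role of the offset in this specification can be illustrated with a small simulated example. This is a minimal sketch, not the study data: the data frame `toy`, its values, and the rates are made up. A Poisson family is used only because it allows an exact check; the offset enters the linear predictor in the same way under a negative binomial family.

```r
# Toy data (simulated, hypothetical): counts observed over unequal scan times
set.seed(42)
n   <- 120
toy <- data.frame(
  Dir  = factor(rep(c("B", "C"), each = n / 2)),
  Scan = runif(n, min = 20, max = 60)            # observation time per row
)
true_rate  <- ifelse(toy$Dir == "B", 0.02, 0.05) # events per unit of time
toy$Diract <- rpois(n, lambda = true_rate * toy$Scan)
toy$LnScan <- log(toy$Scan)

# log E[Diract] = LnScan + b0 + b1 * (Dir == "C"), so the exponentiated
# coefficients describe rates per unit of observation time, not raw counts
fit <- glm(Diract ~ Dir + offset(LnScan), family = poisson("log"), data = toy)

# Rate per group implied by the model ...
model_rate <- exp(c(B = unname(coef(fit)[1]), C = unname(sum(coef(fit)))))
# ... matches total events divided by total observation time in each group
raw_rate <- tapply(toy$Diract, toy$Dir, sum) / tapply(toy$Scan, toy$Dir, sum)

print(rbind(model_rate, raw_rate))
```

Because the offset coefficient is fixed at 1, doubling the observation time doubles the expected count while leaving the estimated rate unchanged.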
I hoped to rerun the analyses in R, which I am now learning to use.
However, I cannot find a straightforward way to run a GEE in R with a
negative binomial family. I am able to code what I want with a Poisson
distribution using geeglm() from package geepack:
library("geepack")
m1 <- geeglm(Diract ~ Dir*Rec + Year + offset(LnScan),
             family = poisson("log"), data = Direct,
             id = ID, corstr = "exchangeable")
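One hedged workaround, in the absence of a negative binomial GEE, is to keep the Poisson mean model and rely on two geeglm() defaults: the scale (dispersion) parameter is estimated rather than fixed at 1, and the reported standard errors are robust (sandwich) estimates. The working variance is then proportional to the mean, which is not the same as a negative binomial variance function, so results need not match the SPSS fit. The sketch below uses simulated data with made-up values; only the column names follow the post. geeglm() expects the rows of each id to be contiguous, hence the sort.

```r
library(geepack)

# Simulated stand-in for the study data (hypothetical values):
# 38 individuals x 3 years x 3 recipient groups
set.seed(1)
n_id <- 38
toy  <- expand.grid(Rec  = c("A", "B", "C"),
                    Year = c("2010", "2011", "2012"),
                    ID   = seq_len(n_id))
toy$Dir    <- factor(ifelse(toy$ID <= 19, "B", "C"))
toy$LnScan <- log(runif(nrow(toy), min = 20, max = 60))

# Overdispersed counts with a shared individual-level effect
id_eff     <- rnorm(n_id, sd = 0.5)
mu         <- exp(toy$LnScan - 3.5 + id_eff[toy$ID])
toy$Diract <- rnbinom(nrow(toy), mu = mu, size = 1)

# Rows belonging to the same individual must sit next to each other
toy <- toy[order(toy$ID), ]

# Poisson mean model; scale is estimated and standard errors are robust
gee_fit <- geeglm(Diract ~ Dir * Rec + Year + offset(LnScan),
                  family = poisson("log"), data = toy,
                  id = ID, corstr = "exchangeable")
summary(gee_fit)
```

The summary reports the estimated scale parameter alongside the coefficients; a value well above 1 indicates overdispersion relative to a Poisson variance.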
The lack of a negative binomial option for GEE in R has been addressed in
the past few years here:
https://www.researchgate.net/post/Does_anyone_know_how_to_undertake_Generalized_Estimating_Equation_GEE_modelling_using_the_negative_binomial_distribution_in_R
and here:
http://r.789695.n4.nabble.com/Negative-Binomial-Regression-td861977.html
A similar question here is unanswered:
https://stats.stackexchange.com/questions/83957/fit-negbin-glm-model-with-autoregressive-correlation-structure
I wonder if there are any newer developments, since the posts are a few
years old. I’m not very advanced in R, so if the solution involves a lot of
creative coding, I won’t likely be able to figure it out. I have already
tried using:
library("sos")
findFn("{generalized estimating equation}")
and researching every package listed. Most seem to leverage gee (JGEE) or
geepack (wgeesel), or lack a negative binomial family (PGEE, spind).
Approach 2: GLMM
I have become more familiar with GLMMs in R, so perhaps that is a better
approach. I tried running GLMMs with package glmmTMB, but I am not sure I
specified the random effect correctly (am I properly accounting for the
repeated measures within AND between years?):
library("glmmTMB")
m2 <- glmmTMB(Diract ~ DirPar*RecPar + offset(LnScan) + Year + (1|ID),
data=Direct, family=list(family="nbinom1",link="log"))
I further tried to specify a compound symmetry covariance structure with
glmmTMB, but this failed:
m2a <- glmmTMB(Diract ~ DirPar*RecPar + offset(LnScan) + Year + cs(1|ID),
data=Direct,family=list(family="nbinom1",link="log"))
Warning message:
In fitTMB(TMBStruc) :
Model convergence problem; non-positive-definite Hessian matrix. See
vignette('troubleshooting')
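One possible reading of this warning: cs(1|ID) requests a compound symmetric covariance among the random effects of a single term (the intercept), so the covariance block is 1 x 1 and its correlation parameter has nothing to be estimated from, which can leave the Hessian non-positive-definite. A plain random intercept (1|ID) already induces an exchangeable correlation among all counts from one individual on the link scale. If counts from the same individual in the same year are expected to be more alike than counts from different years, a nested intercept is one option. The sketch below illustrates that idea on simulated data with made-up values; it is an assumption about what the study design calls for, not a verified analysis.

```r
library(glmmTMB)

# Simulated stand-in for the study data (hypothetical values)
set.seed(2)
n_id <- 38
toy  <- expand.grid(RecPar = c("A", "B", "C"),
                    Year   = c("2010", "2011", "2012"),
                    ID     = factor(seq_len(n_id)))
toy$DirPar <- factor(ifelse(as.integer(toy$ID) <= 19, "B", "C"))
toy$LnScan <- log(runif(nrow(toy), min = 20, max = 60))

# Individual-level and individual-by-year effects on the log scale
id_eff  <- rnorm(n_id, sd = 0.5)
id_year <- interaction(toy$ID, toy$Year, drop = TRUE)
iy_eff  <- rnorm(nlevels(id_year), sd = 0.3)
mu <- exp(toy$LnScan - 3.5 +
          id_eff[as.integer(toy$ID)] + iy_eff[as.integer(id_year)])

# size = mu gives variance 2 * mu, i.e. linear in the mean (NB1-type)
toy$Diract <- rnbinom(nrow(toy), mu = mu, size = mu)

# (1 | ID): between-year repeated measures on the same individual
# (1 | ID:Year): extra similarity of the counts within one year
fit_nested <- glmmTMB(Diract ~ DirPar * RecPar + Year + offset(LnScan) +
                        (1 | ID) + (1 | ID:Year),
                      data = toy, family = nbinom1(link = "log"))
VarCorr(fit_nested)
```

If the ID:Year variance is estimated near zero, the simpler (1|ID) model is probably adequate for the data at hand.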
I also tried Ben Bolker’s suggestion (posted on Nabble; see link above) to
use glmmPQL, but I got an error message:
m3 <- glmmPQL(Diract ~ Dir*Rec + offset(LnScan) + Year, random = ~ 1 | ID,
family = negative.binomial(1), data = Direct,
correlation=corCompSymm(form=~1|ID))
Error in glmmPQL(Diract ~ DirPar * RecPar + offset(LnScan) + Year, random =
~1 | : could not find function "corCompSymm"
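A likely cause of this error, assuming only MASS had been attached: glmmPQL() comes from MASS, but corCompSymm() is exported by nlme, so R cannot find it until nlme is loaded (or the call is written as nlme::corCompSymm). The sketch below shows this on simulated data with made-up values. The fit itself keeps only the random intercept, because adding a compound symmetric residual correlation on the same grouping would describe the same exchangeable pattern twice.

```r
library(MASS)   # glmmPQL(), negative.binomial()
library(nlme)   # corCompSymm() and the other correlation structures

# With nlme attached, the constructor is found
cs <- corCompSymm(form = ~ 1 | ID)
class(cs)

# Simulated stand-in for the study data (hypothetical values)
set.seed(3)
n_id <- 38
toy  <- expand.grid(Rec  = c("A", "B", "C"),
                    Year = c("2010", "2011", "2012"),
                    ID   = seq_len(n_id))
toy$Dir    <- factor(ifelse(toy$ID <= 19, "B", "C"))
toy$LnScan <- log(runif(nrow(toy), min = 20, max = 60))
id_eff     <- rnorm(n_id, sd = 0.5)
mu         <- exp(toy$LnScan - 3.5 + id_eff[toy$ID])
toy$Diract <- rnbinom(nrow(toy), mu = mu, size = 1)

# Negative binomial PQL fit with theta fixed at 1, random intercept per ID
pql_fit <- glmmPQL(Diract ~ Dir * Rec + Year + offset(LnScan),
                   random = ~ 1 | ID,
                   family = negative.binomial(theta = 1),
                   data = toy, verbose = FALSE)
summary(pql_fit)$tTable
```

Note that negative.binomial() needs theta supplied in advance; glmmPQL() does not estimate it.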
If anyone has tips for a) a GEE with a negative binomial family and/or b)
making sure I am specifying the random effect in a GLMM correctly to
account for multiple measures within and across years, that would be
greatly appreciated. Thank you from a self-taught but passionate R user!
Bethany K. Hansen, PhD
Chimp Haven sanctuary