[R] gee niggles
farewelld at cf.ac.uk
Mon Dec 8 18:53:55 CET 2008
I'm not sure if the gee package is still actively maintained, but I for one find it extremely useful. However, I've come across a few infelicities that I'm hoping could be resolved for future versions. Hope it's okay to list them all in one post! They are:
(1) AR(1) models don't fit when clustsize = 1 for any subject, even if some subjects have clustsize > 1.
(2) If the working correlation matrix has dimension less than 4x4, it doesn't get printed.
(3) Using a "fixed" working correlation matrix stored in integer mode crashes R.
To illustrate, generate some data:
df <- data.frame(i = rep(1:5, each = 5), j = rep(1:5, 5), y = rep(rnorm(5), each = 5) + rnorm(25))
An AR(1) model fits fine to the full data:
gee(y ~ 1, id = i, df, corstr = "AR-M", Mv = 1)
So also when some subjects have fewer observations than others:
gee(y ~ 1, id = i, df, subset = j <= i + 1, corstr = "AR-M", Mv = 1)
(1) However, when any subject (in this case, the first) has only 1 observation, gee bails out:
gee(y ~ 1, id = i, df, subset = j <= i & j <= 2, corstr = "AR-M", Mv = 1)
I see no particular reason an AR(1) working structure shouldn't be fit to these data. In fact, when (as here) there are at most two observations per subject, the AR(1) and exchangeable structures are equivalent, and the latter fits fine:
gee(y ~ 1, id = i, df, subset = j <= i & j <= 2, corstr = "exchangeable")
(2) This brings up the second niggle. When all cluster sizes are less than four, gee fits fine but the print method tries to extract elements of the working correlation matrix that don't exist (x$working.correlation[1:4, 1:4]):
gee(y ~ 1, id = i, df, subset = j <= 3)
(3) Finally, we might want to explicitly enter the identity matrix as the working correlation. If we do it this way
gee(y ~ 1, id = i, df, corstr = "fixed", R = outer(1:5, 1:5, function(x, y) as.numeric(x == y)))
then all is well, but like this
# gee(y ~ 1, id = i, df, corstr = "fixed", R = outer(1:5, 1:5, function(x, y) as.integer(x == y))) # not run
crashes R. I think it has to do with the correlation matrix being stored in integer mode:
str(outer(1:5, 1:5, function(x, y) as.integer(x == y)))
Presumably an explicit conversion to numeric mode would take care of this. Here's my sessionInfo(), in case it's of relevance:
R version 2.7.2 (2008-08-25)
attached base packages:
 stats graphics grDevices utils datasets methods base
other attached packages:
Many thanks in advance!
More information about the R-help