[R] GEE with nested design

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Wed Apr 22 13:20:41 CEST 2009


Dear Dimitris,

Thanks for your prompt reply.

I agree that, in theory, the choice of working correlation structure
affects only the efficiency. However, the estimated effect of Year
changes dramatically when I compare the unstructured correlation with my
other attempts. Most models give an effect of Year of about -0.12, but
with the unstructured correlation structure the sign flips and the
effect becomes 0.10. In all models the effect of Year is strongly
significant.

Therefore I think I will stick to the model with PlotID as id and an
exchangeable correlation structure.
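
For reference, a minimal sketch of that model on simulated data. The data
frame `d`, the 0/1 coding of Year, the plot effect and the value -0.12 used
to simulate Y are all assumptions for illustration only; they are not the
real data or results. The fit is only attempted if geepack is installed.

```r
set.seed(1)

# Simulated stand-in for the real data: 72 plots, 24 trees per plot, 2 years.
# Year is coded 0/1 here (an assumption; the real coding is not shown).
d <- expand.grid(Year = 0:1, Tree = 1:24, PlotID = 1:72)

# Hypothetical plot-level random shift to induce within-plot correlation
plot_effect <- rnorm(72, sd = 0.5)
d$Y <- rbinom(nrow(d), 1, plogis(-0.12 * d$Year + plot_effect[d$PlotID]))

# Rows of the same cluster should be contiguous; expand.grid() already
# returns them in this order, the sort just makes the requirement explicit.
d <- d[order(d$PlotID, d$Tree, d$Year), ]

if (requireNamespace("geepack", quietly = TRUE)) {
  # PlotID as the independent unit, exchangeable working correlation
  fit <- geepack::geeglm(Y ~ Year, id = PlotID, data = d,
                         family = binomial, corstr = "exchangeable")
  print(summary(fit))
}
```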

Regards,

Thierry


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original message-----
From: Dimitris Rizopoulos [mailto:d.rizopoulos at erasmusmc.nl] 
Sent: Wednesday 22 April 2009 11:48
To: ONKELINX, Thierry
CC: r-help at r-project.org
Subject: Re: [R] Gee with nested desgin

Hi Thierry,

in geeglm() and in connection with the GEE theory, argument "id" should 
identify the independent sample units in your data set, which seem to be 
the plots. GEE will produce consistent estimates even if you misspecify 
the correlation structure in your data (i.e., ignore nested design). If 
you specify it correctly, you gain in efficiency. So, in your case you 
may gain a little (or a lot) if you specify your own nested working 
correlation matrix. In geeglm(), argument 'corstr' gives the option for 
a user-defined correlation matrix, but I don't know how exactly this 
needs to be defined.
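
As an illustration of what such a nested working correlation matrix looks
like numerically, here is a minimal base-R sketch for one plot with 3 trees
and 2 years. The values of a1, a2 and a3 are hypothetical placeholders, not
estimates, and the sketch only builds the matrix; it does not show how to
pass it to geeglm(), for which the geepack help pages should be consulted.

```r
# Hypothetical correlations (placeholders, not estimated from any data)
a1 <- 0.5  # same tree, different year
a2 <- 0.3  # different trees, same plot, same year
a3 <- 0.2  # different trees, same plot, different year

n_trees <- 3
n_years <- 2

# Block for two observations on the same tree: 1 on the diagonal, a1 elsewhere
within_tree <- matrix(a1, n_years, n_years)
diag(within_tree) <- 1

# Block for observations on two different trees:
# a2 when the years match (diagonal), a3 otherwise
between_tree <- matrix(a3, n_years, n_years)
diag(between_tree) <- a2

# Rows and columns are ordered tree by tree, with years nested within trees
I_t <- diag(n_trees)
J_t <- matrix(1, n_trees, n_trees)
R <- kronecker(I_t, within_tree) + kronecker(J_t - I_t, between_tree)

print(R)

# A working correlation matrix should be positive definite
stopifnot(all(eigen(R, symmetric = TRUE)$values > 0))
```

The Kronecker construction scales directly to 24 trees per plot by changing
n_trees, which avoids typing out the 48 x 48 matrix by hand.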

Regarding the 'waves' argument: AFAIK, it is used in connection with 
the AR1 working correlation structure in order to identify the order and 
spacing of the observations (e.g., if subject 1 has visits at time 1 and 
time 2, whereas subject 2 has visits at time 1 and time 3).

A final thing: you said that you have some missing values. In this case 
note that the GEE approach will be valid (i.e., produce unbiased 
estimates) if the missing data mechanism is Missing Completely At 
Random, that is when the reasons for missingness do not depend on Y.

I hope it helps.

Best,
Dimitris


ONKELINX, Thierry wrote:
> Dear all,
> 
> Is it possible to incorporate a nested design in GEE? I have
> measurements on trees that were measured in two years. The trees are
> nested in plots. Each plot contains 24 trees. The number of plots is 72.
> Hence we would expect 2 * 24 * 72 = 3456 data points. A few are missing,
> so we end up with 3431 data points.
> 
> This is what I have tried until now.
> 
> #assuming independence between trees and thus ignoring the plot level.
> library(geepack)
> geeglm(formula = Y ~ Year, id = TreeID, family = binomial, corstr =
> "exchangeable")
> 
> #using waves. But I'm wondering if this is correct.
> library(geepack)
> geeglm(formula = Y ~ Year, id = PlotID, waves = TreeID, family =
> binomial, corstr = "exchangeable")
> 
> #using an unstructured correlation on the plot level. geeglm with
> #unstructured correlation resulted in an out of memory error.
> library(Zelig)
> zelig(formula = Y ~ Year, model = "logit.gee", id = "PlotID", corstr =
> "unstructured")
> 
> #Ideally I think I need a correlation matrix structured like below
> #(given for a plot with 3 trees). Here a1 is the correlation within a
> #tree, a2 the correlation between trees from the same plot and the same
> #year and a3 the correlation between trees from the same plot but a
> #different year. Does it make sense to run the model with an unstructured
> #correlation, calculate the average a1, a2 and a3 and use that as a fixed
> #working correlation?
> 
> matrix(c(
> 	1,    "a1", "a2", "a3", "a2", "a3", 
> 	"a1", 1,    "a3", "a2", "a3", "a2", 
> 	"a2", "a3", 1,    "a1", "a2", "a3", 
> 	"a3", "a2", "a1", 1,    "a3", "a2", 
> 	"a2", "a3", "a2", "a3", 1,    "a1", 
> 	"a3", "a2", "a3", "a2", "a1", 1    
> 	), nrow = 6)
> 
> Best regards,
> 
> Thierry
> 

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.



