[R] SE for all fixed factor effect in GLMM

Wed Jan 2 00:29:18 CET 2019

Please keep communications on-list.

On 1/2/19 10:57 AM, Marc Girondot wrote:
> On 01/01/2019 at 22:35, Rolf Turner wrote:
>> On 1/2/19 9:35 AM, Marc Girondot wrote:
>>> Hello members of the list,
>>>
>>> Three days ago I asked a question about "how to get the SE of all effects 
>>> after a glm or glmm". Here is a synthesis of the answers and a new 
>>> solution:
>>>
>>> For example:
>>>
>>> x <- rnorm(100)
>>>
>>> y <- rnorm(100)
>>>
>>> G <- as.factor(sample(c("A", "B", "C", "D"), 100, replace = TRUE))
>>>
>>> G <- relevel(G, "A")
>>>
>>>
>>> m <- glm(y ~ x + G)
>>>
>>> summary(m)$coefficients
>>>
>>>
>>> No SE is reported for level A of the factor G (the reference level, 
>>> which is absorbed into the intercept).
>>>
>>>
>>> * Here is a synthesis of the answers:
>>>
>>>
>>> 1/ The first solution was proposed by Rolf Turner 
>>> <r.turner using auckland.ac.nz>. It is to add + 0 to the formula, which 
>>> gives the SE for all 4 levels (it also works with objects obtained 
>>> with lme4::lmer()):
>>>
>>> m1 <- glm(y ~ x + G + 0)
>>>
>>> summary(m1)$coefficients
>>>
>>>
>>> However, this solution using + 0 does not work if more than one 
>>> categorical predictor is included. Only the levels of the first one 
>>> have all their SEs estimated.
>>
>> Well, you only asked about the setting in which there was only one 
>> categorical predictor.  If there are, e.g. two (say "G" and "H") try
>>
>> m2 <- glm(y ~ x + G:H + 0)
>>
>> I would suggest that you learn a bit about how the formula structure 
>> works in linear models.
>>
>> cheers,
>>
>> Rolf Turner
>>
>> P.S.  Your use of relevel() is redundant/irrelevant in this context.
>>
>> R. T.
>>
> Thanks for the advice. But based on my limited knowledge of formula 
> structure in linear models, A+B is not the same as A:B.

That is very true!  But I never suggested using "A+B".  In the context 
of an additive model there is *NO WAY* to make sense of parameters 
corresponding to each level of each factor.  Consequently there can be 
no way to form estimates of such parameters or of the standard errors of 
such estimates.  They cannot be made meaningful.  (This is, in effect, 
the reason for the existence of the --- rather confusing --- 
over-parametrised model.)
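
A minimal sketch of why, assuming a small balanced design with a
hypothetical second factor H (three levels, not from the original data):
if every level of both factors gets its own indicator column, the columns
for G sum to the same vector of ones as the columns for H, so the design
matrix has one more column than its rank.

```r
# Hypothetical balanced layout: 4 levels of G crossed with 3 levels of H
G <- factor(rep(c("A", "B", "C", "D"), times = 3))
H <- factor(rep(c("a", "b", "c"), each = 4))

# One indicator column per level of each factor (what "a parameter for
# each level of each factor" would require in an additive model)
X <- cbind(model.matrix(~ G + 0), model.matrix(~ H + 0))

ncol(X)      # 7 columns
qr(X)$rank   # rank 6: one column is redundant
```

Adding any constant to all four G parameters and subtracting it from all
three H parameters leaves the fitted values unchanged, which is why the
individual parameters cannot be estimated.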

> The first structure used 6 parameters and the second one 14.

Well, it depends on how many levels each of A and B has!  But yes, the 
numbers of parameters will be different.  They are different models.
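
To make the count concrete, here is a minimal sketch on made-up data. The
second factor H and its three levels are assumptions for illustration
only. The number of columns of the model matrix is the number of
parameters in each model.

```r
set.seed(42)
n <- 100
x <- rnorm(n)
G <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))
H <- factor(sample(c("a", "b", "c"), n, replace = TRUE))  # hypothetical factor

X_add  <- model.matrix(~ x + G + H)     # additive model
X_cell <- model.matrix(~ x + G:H + 0)   # one indicator per (G, H) cell

ncol(X_add)    # 7: intercept, x, 3 contrasts for G, 2 contrasts for H
ncol(X_cell)   # 13: x plus 4 * 3 cell indicators
```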

> Then adding 
> +0 does not solve the problem... or perhaps I am wrong?
> 
> Thanks for your time.

For an *additive* model "+0" does indeed not solve the problem.  In this 
context the "problem" has no solution.

You might get some insight by reading about "the cell means model" in
"Linear Models" by Shayle R. Searle:

@book{searle1997,
   title={Linear Models},
   author={Searle, S.R.},
   isbn={9780471184997},
   year={1997},
   publisher={Wiley}
}

If you use the model I suggested (i.e. G:H + 0) you get an explicit 
estimate for each cell mean, and the standard errors of these estimates.
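
A minimal sketch of this on simulated data, assuming a hypothetical
second factor H with three levels (the data and the names m3 and
cell_means are made up for illustration):

```r
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rnorm(n)
G <- factor(sample(c("A", "B", "C", "D"), n, replace = TRUE))
H <- factor(sample(c("a", "b", "c"), n, replace = TRUE))  # hypothetical factor

# Cell means model with a common slope for x
m2 <- glm(y ~ x + G:H + 0)
est <- summary(m2)$coefficients   # one row per (G, H) cell, plus one for x
nrow(est)

# Without the covariate, the coefficients are just the raw cell means
m3 <- glm(y ~ G:H + 0)
cell_means <- tapply(y, list(G, H), mean)
all.equal(unname(coef(m3)), as.vector(cell_means))
```

With the covariate x in the model, each G:H coefficient is the expected
response for that cell at x = 0 rather than the raw cell mean.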

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276