[R] Intercept in Model Matrix (Parameters not what I expected)

Bert Gunter bgunter.4567 at gmail.com
Mon Aug 22 17:14:24 CEST 2016


Justin:

As you have not yet received any reply...

Your question is mostly about statistics (linear models) and, as such,
is typically off topic here. Briefly, you do seem confused about
contrasts in linear models, but I am confused about your confusion,
and so may be of little help. However....

Note that in your little 8-run example design, the response lives in 8
dimensions, and so your model matrix can have at most 8 independent
columns. ~(A+B) has 4, which, using contr.treatment contrasts, could be
Intercept, A2, B2, B3 (since (B3+B4) - (B2+B1) is confounded with (A2 -
A1), where these are "dummy" encodings of 0 and 1). Adding all
pairwise products of the non-intercept columns would not give you any
more, as each product is either all 0's (A2*B2, B2*B3) or a copy of a
column you already have (A2*B3 = B3). I do not know the algorithm that
lm/aov uses to choose which of the contrasts to estimate, but it makes
no difference: there can only be 3 beyond the intercept, and all others
are linear combinations of these.
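
The rank argument above can be checked directly in R. The following is
only a sketch: the object names (X_add, X4, prods, X_nest, fit) are made
up for illustration, and y is an arbitrary simulated response, since none
was given in the original post. As far as I know, lm() fits through a
pivoted QR decomposition and reports NA for columns it finds linearly
dependent on earlier ones, which is what the last lines show.

```r
# Rebuild the 8-run design from the quoted message
A <- gl(2, 4)   # 2 levels, 4 runs each: 1 1 1 1 2 2 2 2
B <- gl(4, 2)   # 4 levels, 2 runs each: 1 1 2 2 3 3 4 4

# Additive model: 5 dummy columns, but only 4 are independent
X_add <- model.matrix(~ A + B)
ncol(X_add)        # number of columns: (Intercept), A2, B2, B3, B4
qr(X_add)$rank     # numerical rank of the model matrix

# The aliasing behind the rank deficiency: A2 equals B3 + B4
all(X_add[, "A2"] == X_add[, "B3"] + X_add[, "B4"])

# Pairwise products of A2, B2, B3 add nothing new
X4 <- X_add[, c("(Intercept)", "A2", "B2", "B3")]
prods <- cbind(A2B2 = X4[, "A2"] * X4[, "B2"],   # all zeros
               A2B3 = X4[, "A2"] * X4[, "B3"],   # same as B3
               B2B3 = X4[, "B2"] * X4[, "B3"])   # all zeros
qr(cbind(X4, prods))$rank   # rank is unchanged by the extra columns

# Nested model from the question: 8 columns, same rank
X_nest <- model.matrix(~ A + A:B)
ncol(X_nest)
qr(X_nest)$rank

# Hypothetical response, only so that lm() has something to fit
set.seed(1)
y <- rnorm(8)
fit <- lm(y ~ A + A:B)
coef(fit)          # aliased columns show up as NA
```

Whichever columns end up estimated, the fitted values are the same: with
two runs in each of the 4 (A, B) cells, every such model just fits the 4
cell means.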

If this is not useful to you, either:

1. Hope for a response here that is more helpful;
2. Consult a local statistical expert;
3. Read up on linear models (there are multiple books and internet sources);
4. Post on stats.stackexchange.com again.

Cheers,
Bert

## Note to others. If I have erred in any of the above, PLEASE CORRECT.




Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Sun, Aug 21, 2016 at 6:44 PM, Justin Thong <justinthong93 at gmail.com> wrote:
> I have something which has been bugging me, and I have even asked this on
> Cross Validated, but I did not get a response.  Let's construct a simple
> example. Below is the code.
>
> A<-gl(2,4) #factor of 2 levels
> B<-gl(4,2) #factor of 4 levels
> y<-rnorm(8) #response; not defined in the original post, any 8 values will do
> df<-data.frame(y,A,B)
>
> As you can see, B is nested within A.
> The peculiar result I am interested in is the model matrix produced when
> I fit a nested model. *How does R decide what is included inside the
> intercept?* Since we are using dummy coding, each coefficient of a
> single-factor model is interpreted as the difference between a particular
> level and the reference level (the intercept). I understand that for
> model ~A, A1 becomes the intercept and that for model ~A+B, A1 and B1
> (both) become the intercept.
>
> *I do not get why, when we use a nested model, A1:B2 appears as a column
> in the model matrix. Why isn't the first parameter of the interaction
> subspace A1:B1 or A2:B1?* I think I am missing the concept. I think the
> intercept is A1. *Hence, why do we not compare the levels A1:B1 and
> A1 (intercept), or A2:B1 and A1 (intercept)?*
>
> #nested model
>> mod<-aov(y~A+A:B)
>> model.matrix(mod)
>   (Intercept) A2 A1:B2 A2:B2 A1:B3 A2:B3 A1:B4 A2:B4
> 1           1  0     0     0     0     0     0     0
> 2           1  0     0     0     0     0     0     0
> 3           1  0     1     0     0     0     0     0
> 4           1  0     1     0     0     0     0     0
> 5           1  1     0     0     0     1     0     0
> 6           1  1     0     0     0     1     0     0
> 7           1  1     0     0     0     0     0     1
> 8           1  1     0     0     0     0     0     1
>
>
> --
> Yours sincerely,
> Justin
