[Rd] Bugs? when dealing with contrasts

Peter Dalgaard pdalgd at gmail.com
Wed Apr 21 22:26:08 CEST 2010


Gabor Grothendieck wrote:
> R version 2.10.1 (2009-12-14)
> Copyright (C) 2009 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> 
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
>   Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Below are two cases where I don't seem to be getting contr.sum
> contrasts even though they were specified.  Are these bugs?
> 
> The first case is an interaction between continuous and factor variables.
> 
> The second case contrasts= was specified as an arg to lm.  The second
> works ok if we set the contrasts through options but not if we set it
> through an lm argument.

In case #2, I think you just failed to read the help page.

> model.matrix(~fac, contrasts=list(fac="contr.sum"))
   (Intercept) fac1 fac2
1            1    1    0
2            1    1    0
3            1    1    0
4            1    1    0
5            1    1    0
6            1    0    1
7            1    0    1
8            1    0    1
9            1    0    1
10           1    0    1
11           1   -1   -1
12           1   -1   -1
13           1   -1   -1
14           1   -1   -1
15           1   -1   -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$fac
[1] "contr.sum"

As for case #1, the rules are tricky in cases where interactions are
present without main effects, but AFAICS, what you observe is
essentially the same effect as

> model.matrix(~fac-1, contrasts=list(fac="contr.sum"))
   fac1 fac2 fac3
1     1    0    0
2     1    0    0
3     1    0    0
4     1    0    0
5     1    0    0
6     0    1    0
7     0    1    0
8     0    1    0
9     0    1    0
10    0    1    0
11    0    0    1
12    0    0    1
13    0    0    1
14    0    0    1
15    0    0    1
attr(,"assign")
[1] 1 1 1
attr(,"contrasts")
attr(,"contrasts")$fac
[1] "contr.sum"


I.e., that R reverts to using indicator variables when the intercept is
absent.

> 
>> # 1. In this case I don't seem to be getting contr.sum contrasts:
>>
>> options(contrasts = c("contr.sum", "contr.poly"))
>> getOption("contrasts")
> [1] "contr.sum"  "contr.poly"
>> scores <- rep(seq(-2, 2), 3); scores
>  [1] -2 -1  0  1  2 -2 -1  0  1  2 -2 -1  0  1  2
>> fac <- gl(3, 5); fac
>  [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
> Levels: 1 2 3
>> # I get this:
>> model.matrix(~ scores:fac)
>    (Intercept) scores:fac1 scores:fac2 scores:fac3
> 1            1          -2           0           0
> 2            1          -1           0           0
> 3            1           0           0           0
> 4            1           1           0           0
> 5            1           2           0           0
> 6            1           0          -2           0
> 7            1           0          -1           0
> 8            1           0           0           0
> 9            1           0           1           0
> 10           1           0           2           0
> 11           1           0           0          -2
> 12           1           0           0          -1
> 13           1           0           0           0
> 14           1           0           0           1
> 15           1           0           0           2
> attr(,"assign")
> [1] 0 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$fac
> [1] "contr.sum"
> 
>> # But I was expecting this since I am using contr.sum
>> cbind(1, model.matrix(~ fac)[,2:3] * scores)
>      fac1 fac2
> 1  1   -2    0
> 2  1   -1    0
> 3  1    0    0
> 4  1    1    0
> 5  1    2    0
> 6  1    0   -2
> 7  1    0   -1
> 8  1    0    0
> 9  1    0    1
> 10 1    0    2
> 11 1    2    2
> 12 1    1    1
> 13 1    0    0
> 14 1   -1   -1
> 15 1   -2   -2
>>
>> # 2.
>> # here I don't get contr.sum but rather get contr.treatment
>> options(contrasts = c("contr.treatment", "contr.poly"))
>> getOption("contrasts")
> [1] "contr.treatment" "contr.poly"
>> model.matrix(lm(seq(15) ~ fac, contrasts = c("contr.sum", "contr.poly")))
>    (Intercept) fac2 fac3
> 1            1    0    0
> 2            1    0    0
> 3            1    0    0
> 4            1    0    0
> 5            1    0    0
> 6            1    1    0
> 7            1    1    0
> 8            1    1    0
> 9            1    1    0
> 10           1    1    0
> 11           1    0    1
> 12           1    0    1
> 13           1    0    1
> 14           1    0    1
> 15           1    0    1
> attr(,"assign")
> [1] 0 1 1
> attr(,"contrasts")
> attr(,"contrasts")$fac
> [1] "contr.treatment"
> 
>> model.matrix(lm(seq(15) ~ fac)) # same
>    (Intercept) fac2 fac3
> 1            1    0    0
> 2            1    0    0
> 3            1    0    0
> 4            1    0    0
> 5            1    0    0
> 6            1    1    0
> 7            1    1    0
> 8            1    1    0
> 9            1    1    0
> 10           1    1    0
> 11           1    0    1
> 12           1    0    1
> 13           1    0    1
> 14           1    0    1
> 15           1    0    1
> attr(,"assign")
> [1] 0 1 1
> attr(,"contrasts")
> attr(,"contrasts")$fac
> [1] "contr.treatment"
> 
>> # I was expecting the first one to give me this
>> options(contrasts = c("contr.sum", "contr.poly"))
>> model.matrix(lm(seq(15) ~ fac))
>    (Intercept) fac1 fac2
> 1            1    1    0
> 2            1    1    0
> 3            1    1    0
> 4            1    1    0
> 5            1    1    0
> 6            1    0    1
> 7            1    0    1
> 8            1    0    1
> 9            1    0    1
> 10           1    0    1
> 11           1   -1   -1
> 12           1   -1   -1
> 13           1   -1   -1
> 14           1   -1   -1
> 15           1   -1   -1
> attr(,"assign")
> [1] 0 1 1
> attr(,"contrasts")
> attr(,"contrasts")$fac
> [1] "contr.sum"
> 
>> R.version.string
> [1] "R version 2.10.1 (2009-12-14)"
>> win.version()
> [1] "Windows Vista (build 6002) Service Pack 2"
> 
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com



More information about the R-devel mailing list