[Rd] model.matrix bug? Nested factor yields singular design matrix.
Steven McKinney
smckinney at bccrc.ca
Thu May 17 20:41:02 CEST 2007
Hi all,
I believe this is a bug in the model.matrix function.
I'd like a second opinion before filing a bug report.
If I have a nested covariate B with multiple values for
just one level of A, I can not get a non-singular design
matrix out of model.matrix
> df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),
+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))
>
> df
A B
1 a b
2 a x
3 x x
4 x x
So of course the full design matrix is singular,
this is expected.
> model.matrix(~ A * B, df)
(Intercept) Aa Bb Aa:Bb
1 1 1 1 1
2 1 1 0 0
3 1 0 0 0
4 1 0 0 0
attr(,"assign")
[1] 0 1 2 3
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
attr(,"contrasts")$B
[1] "contr.treatment"
I'd like to drop the B main effect column,
but get the unexpected result of a column of zeroes.
> model.matrix(~ A * B - B, df)
(Intercept) Aa Ax:Bb Aa:Bb
1 1 1 0 1
2 1 1 0 0
3 1 0 0 0
4 1 0 0 0
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
attr(,"contrasts")$B
[1] "contr.treatment"
>
>
This does not happen in S-PLUS.
> info()
S info file C:\DOCUME~1\kilroy\LOCALS~1\Temp\S0107EB.tmp will be removed at session end
$Sinfo:
Enterprise Developer Version 7.0.6 for Microsoft Windows : 2005
SHOME: C:/PROGRAMFILES/INSIGHTFUL/splus70
prog.name: SPLUS.EXE
load.date: Sun Dec 04 23:15:59 2005
date: Thu May 17 07:38:16 PDT 2007
> options(contrasts = c("contr.treatment", "contr.poly"))
> df <- data.frame(A = factor(c("a", "a", "x", "x"), levels = c("x", "a")),
+ B = factor(c("b", "x", "x", "x"), levels = c("x", "b")))
> model.matrix(~ A * B - B, df)
(Intercept) A A:B
1 1 1 1
2 1 1 0
3 1 0 0
4 1 0 0
>
This is what I was expecting to get in R, but can not.
Alternate specifications in R continue to yield a singular
design matrix
>
> model.matrix(~ A/B, df)
(Intercept) Aa Ax:Bb Aa:Bb
1 1 1 0 1
2 1 1 0 0
3 1 0 0 0
4 1 0 0 0
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
attr(,"contrasts")$B
[1] "contr.treatment"
> model.matrix(~ A + A:B, df)
(Intercept) Aa Ax:Bb Aa:Bb
1 1 1 0 1
2 1 1 0 0
3 1 0 0 0
4 1 0 0 0
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$A
[1] "contr.treatment"
attr(,"contrasts")$B
[1] "contr.treatment"
Why is the Ax:Bb column being included?
Have I missed a control parameter or some other way
of specifying to model.matrix not to include this
extra column?
Any feedback appreciated.
Best regards
Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre
email: smckinney at bccrc.ca
tel: 604-675-8000 x7561
BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada
More information about the R-devel
mailing list