[R] Singular model.matrix of nested designs

Peng Yu pengyu.ut at gmail.com
Wed Sep 23 01:03:18 CEST 2009


On Tue, Sep 22, 2009 at 5:00 PM, Uwe Ligges
<ligges at statistik.tu-dortmund.de> wrote:
>
>
> Peng Yu wrote:
>>
>> Hi,
>>
>> I want to do ANOVA for nested designs like following. I don't
>> understand why the matrix (t(X) %*% X) is singular. Can somebody help
>> me understand it?
>
>
> Yes:
>
> A=1 is never combined with B=4, B=5, or B=6
> A=2 is never combined with B=1, B=2, or B=3
>
> hence you cannot estimate your model since the corresponding columns in your
> Design matrix are all zero.

I have some additional questions. Please see code below.

Am I coding the nested design correctly? Suppose A factor levels
represent different schools (totally 2 schools), and there are three
instructors (B factor, totally 6 different instructor) in each school
teach the same course, and Y is the performance of the student in each
class.


'X[match(unique(fls),fls),]%*%afit$coefficients' and
'afit$fitted.values[match(unique(fls),fls)]' should give me the same
results, for a non-nested design. However, since there are NA's in
afit$coefficients, 'X[match(unique(fls),fls),]%*%afit$coefficients'
would not work. I am wondering how internally 'aov()' does to compute
'afit$fitted.values' and how 'aov()' deal with NA's?

> a=2
> b=3
> n=4
> A = as.vector(sapply(1:a,function(x){rep(x,b*n)}))
> B = as.vector(sapply(1:(a*b), function(x){rep(x,n)}))
> cbind(A,B)
      A B
 [1,] 1 1
 [2,] 1 1
 [3,] 1 1
 [4,] 1 1
 [5,] 1 2
 [6,] 1 2
 [7,] 1 2
 [8,] 1 2
 [9,] 1 3
[10,] 1 3
[11,] 1 3
[12,] 1 3
[13,] 2 4
[14,] 2 4
[15,] 2 4
[16,] 2 4
[17,] 2 5
[18,] 2 5
[19,] 2 5
[20,] 2 5
[21,] 2 6
[22,] 2 6
[23,] 2 6
[24,] 2 6
> Y = A + B + rnorm(a*b*n)
>
> fr = data.frame(Y=Y,A=as.factor(A),B=as.factor(B))
> afit=aov(Y ~ A / B - 1,fr)
> summary(afit)
          Df Sum Sq Mean Sq F value    Pr(>F)
A          2 664.08  332.04 346.533 4.268e-15 ***
A:B        4  45.87   11.47  11.969 6.403e-05 ***
Residuals 18  17.25    0.96
---
Signif. codes:  0 \u2018***\u2019 0.001 \u2018**\u2019 0.01
\u2018*\u2019 0.05 \u2018.\u2019 0.1 \u2018 \u2019 1
> afit$coefficients
        A1         A2      A1:B2      A2:B2      A1:B3      A2:B3      A1:B4
 0.6560352  8.3303605  2.6997913         NA  3.4210004         NA         NA
     A2:B4      A1:B5      A2:B5      A1:B6      A2:B6
-3.1046949         NA -1.0866216         NA         NA
>
> fls<-apply(model.matrix(afit),1,paste,collapse=',')
> unique(fls)
[1] "1,0,0,0,0,0,0,0,0,0,0,0" "1,0,1,0,0,0,0,0,0,0,0,0"
[3] "1,0,0,0,1,0,0,0,0,0,0,0" "0,1,0,0,0,0,0,1,0,0,0,0"
[5] "0,1,0,0,0,0,0,0,0,1,0,0" "0,1,0,0,0,0,0,0,0,0,0,1"
> fr[match(unique(fls),fls),]
           Y A B
1  0.4981167 1 1
5  3.3801548 1 2
9  5.0535424 1 3
13 3.3227052 2 4
17 7.2438041 2 5
21 7.4057575 2 6
>
> X=model.matrix(afit)
> X[match(unique(fls),fls),]
   A1 A2 A1:B2 A2:B2 A1:B3 A2:B3 A1:B4 A2:B4 A1:B5 A2:B5 A1:B6 A2:B6
1   1  0     0     0     0     0     0     0     0     0     0     0
5   1  0     1     0     0     0     0     0     0     0     0     0
9   1  0     0     0     1     0     0     0     0     0     0     0
13  0  1     0     0     0     0     0     1     0     0     0     0
17  0  1     0     0     0     0     0     0     0     1     0     0
21  0  1     0     0     0     0     0     0     0     0     0     1
>
> X[match(unique(fls),fls),]%*%afit$coefficients
   [,1]
1    NA
5    NA
9    NA
13   NA
17   NA
21   NA
> afit$fitted.values[match(unique(fls),fls)]
        1         5         9        13        17        21
0.6560352 3.3558266 4.0770357 5.2256655 7.2437388 8.3303605
>




More information about the R-help mailing list