[R] lm design matrix bug?

toby909 at gmail.com toby909 at gmail.com
Mon Oct 29 07:59:02 CET 2007


Hi All

Maybe I don't understand it, but I would have expected the design matrix to have 
as many rows as there were observations available to fit the model.
Below I create a small artificial dataset, fit one model, and print the design 
matrix, which has 27 rows. Then I delete 6 observations and fit the model on 
the remaining 21, but the design matrix that comes out has 26 rows?

Thanks for your enlightenment.

Toby




# artificial data on a 3 x 3 x 3 grid (27 observations)
y = c()
x1 = c()
x2 = c()
idx = 1
for (i in 1:3) {
  for (j in 1:3) {
   for (k in 1:3) {
    y[idx] = 30*i+10*j+100*i*j+30*k-60
    x1[idx] = i
    x2[idx] = j
    idx = idx+1
   }
  }
}

# x=TRUE stores the design matrix in the fitted object as lm11$x
lm11 = lm(y ~ factor(x1)*factor(x2), x=TRUE)
summary(lm11)
unique(predict(lm11))

X = lm11$x; X

# P = (X'X)^{-1} X', the matrix that maps the response to the coefficient estimates
P = solve(t(X)%*%X) %*% t(X); round(P,3)


# set 6 of the 27 responses to missing
y[c(3, 6, 12, 18, 24, 27)] = NA

lm21 = lm(y ~ factor(x1)*factor(x2), x=TRUE)
summary(lm21)
unique(predict(lm21))

X = lm21$x; X

P = solve(t(X)%*%X) %*% t(X); round(P,3)
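
The row counts can also be compared directly instead of being read off the
printed matrix. The following is only a minimal sketch, assuming that
options("na.action") is left at its default (na.omit). The names g,
n_complete, n_design and n_dropped are made up for illustration and are not
part of the code above; the data are rebuilt with expand.grid() rather than the
loop, which should give the same 27 values in the same order.

```r
# Rebuild the 3 x 3 x 3 example data (27 rows) without the loop.
# expand.grid() varies its first argument fastest, so k is listed first
# to match the loop order above (k innermost, i outermost).
g  <- expand.grid(k = 1:3, j = 1:3, i = 1:3)
y  <- with(g, 30*i + 10*j + 100*i*j + 30*k - 60)
x1 <- g$i
x2 <- g$j

# The six observations that were set to missing
y[c(3, 6, 12, 18, 24, 27)] <- NA

fit <- lm(y ~ factor(x1) * factor(x2), x = TRUE)

n_complete <- sum(!is.na(y))          # observations with a non-missing response
n_design   <- nrow(fit$x)             # rows of the stored design matrix
n_dropped  <- length(fit$na.action)   # rows removed by na.omit (assumed default)

print(c(complete = n_complete, design = n_design, dropped = n_dropped))
```

If n_design and n_complete disagree in a session, it may be worth checking
whether y, x1 and x2 in the workspace are really the vectors that were meant
to go into the fit.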


