[R] explanation of lm's coefficients

Justin Fay jfay at genetics.wustl.edu
Sun Aug 24 00:44:19 CEST 2003


I don't understand the coefficients returned from the lm function. I 
expected these to be the mean values for each factor in the model. Given 
this data and model:

data<-c(rnorm(10,mean=0,sd=1),rnorm(10,mean=1,sd=1),rnorm(10,mean=-.5,sd=1))
ftr<-as.factor(rep(1:3,each=10))
fit<-lm(data ~ ftr)

the mean values of the three factors from the data:

c(mean(data[1:10]), mean(data[11:20]), mean(data[21:30]))
[1] -0.3589049  0.6034931 -0.7256897

are not the same as the coefficients returned from fit:

fit$coef
(Intercept)        ftr2        ftr3
 -0.3589049   0.9623980  -0.3667847

ftr2 and ftr3 are offset by the value of the intercept.
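
A minimal sketch of that offset, assuming R's default treatment contrasts (contr.treatment) are in effect. The set.seed() call and the names b, rebuilt and group_means are assumptions added for illustration, so the numbers will differ from the output shown in this message:

```r
# Assumption: arbitrary seed, only so the sketch is reproducible.
set.seed(1)
data <- c(rnorm(10, mean = 0, sd = 1),
          rnorm(10, mean = 1, sd = 1),
          rnorm(10, mean = -0.5, sd = 1))
ftr <- as.factor(rep(1:3, each = 10))
fit <- lm(data ~ ftr)

b <- coef(fit)

# Add the intercept back to the ftr2 and ftr3 coefficients.
rebuilt <- c(b["(Intercept)"],
             b["(Intercept)"] + b["ftr2"],
             b["(Intercept)"] + b["ftr3"])

# Per-level means computed directly from the data.
group_means <- tapply(data, ftr, mean)

# Compare the two, ignoring names.
all.equal(unname(rebuilt), as.vector(group_means))
```

If the comparison returns TRUE, the intercept is the mean of level 1, and ftr2 and ftr3 are the differences between the means of levels 2 and 3 and the mean of level 1.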

The fitted values are as I expected:

fit$fitted.values
         1          2          3          4          5          6          7
-0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049 -0.3589049
         8          9         10         11         12         13         14
-0.3589049 -0.3589049 -0.3589049  0.6034931  0.6034931  0.6034931  0.6034931
        15         16         17         18         19         20         21
 0.6034931  0.6034931  0.6034931  0.6034931  0.6034931  0.6034931 -0.7256897
        22         23         24         25         26         27         28
-0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897 -0.7256897
        29         30
-0.7256897 -0.7256897

My goal is to get the mean values of the factors. Although that is easily 
done, I don't understand why ftr2 and ftr3 are offset by the value of the 
intercept. Any explanations would be appreciated.
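
For reference, two ways the per-level means might be obtained directly, sketched under the assumption of an arbitrary seed (not part of the run shown above). The names means_tapply and fit_noint are hypothetical, chosen only for this example:

```r
# Assumption: arbitrary seed, only so the sketch is reproducible.
set.seed(1)
data <- c(rnorm(10, mean = 0, sd = 1),
          rnorm(10, mean = 1, sd = 1),
          rnorm(10, mean = -0.5, sd = 1))
ftr <- as.factor(rep(1:3, each = 10))

# Option 1: compute the mean of data within each level of ftr.
means_tapply <- tapply(data, ftr, mean)

# Option 2: drop the intercept, so lm estimates one coefficient per level.
fit_noint <- lm(data ~ ftr - 1)

means_tapply
coef(fit_noint)
```

With the intercept removed, the coefficients are labelled ftr1, ftr2 and ftr3 and should match the tapply result; the fitted values are the same as in the model with an intercept.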

Justin

________________________________________
Justin Fay
Assistant Professor of Genetics
Washington University School of Medicine
4566 Scott Ave, St. Louis, MO 63110
PH: 314.747.1808 Fax: 314.362.7855



