[R] contr.sdif from MASS

Christoph Scherber Christoph.Scherber at agr.uni-goettingen.de
Tue Nov 24 22:52:34 CET 2009


Dear all,

When using contr.sdif (from MASS), is it true that the *Intercept* is the
mean of the response variable *across all levels* of the explanatory
variables included in the model? Somehow, it doesn´t seem to be the
overall mean.

Many thanks for any help!
Christoph



# Below is an example:
# (using R 2.10.0 on Windows XP)


####
options(contrasts=c("contr.sdif","contr.poly"))
test1=lm(response.variable~G,mydata)

                   Estimate Std. Error t value Pr(>|t|)
(Intercept)           347.05      15.23  22.783  < 2e-16 ***
GLevel1-GLevel2       192.61      30.47   6.322 1.43e-08 ***

# so the contr.sdif intercept is
# the MEAN of the response variable *across all levels of G*, right?

#####
options(contrasts=c("contr.treatment","contr.poly"))
test2=lm(response.variable~G,mydata)
summary(test2)
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   250.74      22.45  11.167  < 2e-16 ***
GLevel2       192.61      30.47   6.322 1.43e-08 ***

# With contr.treatment, the intercept obviously is the mean of
# the first level of G.

####
options(contrasts=c("contr.sdif","contr.poly"))
test3=lm(response.variable~G+covariate,mydata)
summary(test3)

                   Estimate Std. Error t value Pr(>|t|)
(Intercept)          362.757     26.586  13.645  < 2e-16 ***
GLevel1-GLevel2      201.557     32.976   6.112 3.63e-08 ***
covariate             -7.424     10.281  -0.722    0.472

# note that 362.757-0.5*201.557=261.9785 (The GLevel1 mean in test4)

####
options(contrasts=c("contr.treatment","contr.poly"))
test4=lm(response.variable~G+covariate,mydata)
summary(test4)

	  	Estimate Std. Error t value Pr(>|t|)
(Intercept)     261.979     27.375   9.570 8.50e-15 ***
GLevel2         201.557     32.976   6.112 3.63e-08 ***
covariate        -7.424     10.281  -0.722    0.472


###

tapply(mydata$response.variable,mydata$G,mean)
GLevel2    Level1
  250.7432   443.3523

mean(tapply(mydata$response.variable,mydata$G,mean))
##  347.0478 #corresponding to the intercept term in test1

with(mydata,mean(tapply(response.variable,list(covariate,G),mean)))
# [1] 363.5611 #this *seems* to be similar to the test3 intercept?!




More information about the R-help mailing list