[R] Levels and GLM

Kuhn, Max Max.Kuhn at pfizer.com
Fri Jul 7 22:10:42 CEST 2006


jdrapp,

By default, R fits full rank models. If you are coming from SAS, you're 
probably used to less than full rank model parameterizations. 

From Section 11.1.1 of "An Introduction to R" at

http://cran.r-project.org/doc/manuals/R-intro.html#Contrasts

there is this:

 "What about a k-level factor A? The answer differs for unordered and 
 ordered factors. For unordered factors k - 1 columns are generated 
 for the indicators of the second, ..., kth levels of the factor. 
 (Thus the implicit parameterization is to contrast the response at 
 each level with that at the first.)"
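
A small sketch of what that means for a two-level factor. The toy vector
maj below is made up for illustration and is not your data:

```r
# Toy factor with the same two levels as in the question (values invented)
maj <- factor(c("M", "M", "N", "M", "N"))

# Levels are sorted alphabetically by default, so "M" comes first
levels(maj)

# The design matrix has an intercept plus one indicator column for "N";
# there is no separate column for the first level "M"
X <- model.matrix(~ maj)
colnames(X)
X[, "majN"]
```

The majN column is 1 for the "N" rows and 0 for the "M" rows, which is why
only majN shows up in the coefficient table.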

So level "M" is the "reference cell". Assuming that 
data.logistic$Overall is continuous, the intercept is the estimate of 
the response on the logit (log-odds) scale, since you used 
family=binomial, when maj = "M" and data.logistic$Overall = 0. The 
estimate for majN is the difference in log-odds between maj = "N" and 
the reference cell maj = "M" at the same value of 
data.logistic$Overall.
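
To see the reference-cell interpretation in a fitted model, here is a
minimal sketch with invented 0/1 data. It assumes the factor is the only
predictor (the continuous covariate is left out to keep it short), so it is
not a reproduction of your model:

```r
# Invented 0/1 data: 10 observations per group, not the poster's data
maj <- factor(rep(c("M", "N"), each = 10))
y   <- c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0,   # 7 of 10 successes for "M"
         1, 1, 1, 1, 0, 0, 0, 0, 0, 0)   # 4 of 10 successes for "N"

fit <- glm(y ~ maj, family = binomial)
b <- unname(coef(fit))   # b[1] is the intercept, b[2] is majN

# Intercept: log-odds of success in the reference level "M"
p_M <- plogis(b[1])          # back-transforms to about 0.7

# majN: log-odds for "N" minus log-odds for "M"
p_N <- plogis(b[1] + b[2])   # back-transforms to about 0.4

c(p_M = p_M, p_N = p_N)
```

With a covariate in the model the same reading holds, but only at a fixed
value of that covariate.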

You should check out ?model.matrix and ?contrasts.
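
As a rough illustration of those two help pages, again with a made-up factor,
and with relevel() shown as one possible way to get a majM term instead
(the name maj2 is just for this example):

```r
# Toy factor (values invented for illustration)
maj <- factor(c("M", "M", "N", "M", "N"))

# The coding R will use: rows are levels, columns are the generated indicators
cm <- contrasts(maj)
cm

# Make "N" the reference level, so a model would report a term for "M"
maj2 <- relevel(maj, ref = "N")
levels(maj2)
colnames(model.matrix(~ maj2))
```

The fit is the same either way; only the level that is absorbed into the
intercept changes.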

Max


> I am using the as.factor command to use with glm.  When I use the
> command
> 
> >maj <- as.factor(data.logistic$Majors)
> >maj
> 
> I receive the following output:
>   [1] M M N M M M M N N M M M N M M M M M M M M M M M N M N N M M N M M N M M M M M
>  [40] N M N M M N M M M N M N M N M N N N M N M M M M M M N M N M M M M M N N M M M
>  [79] M M M N N M M N M N M M M M M M M M M M M M M M M N M M M M M N M M M M M N M
> [118] M M M N M N N M M M M M M M M N M N M M M M M N M M M M N M M M N N M M M N M
> [157] M M M M M M M M M M M M M N M M N N M M N M M M M M M M M M M M M M N M N M M
> [196] M N M M M M M M M M N M M M M M M M M N M M M M M M M M M M M M M M N M M N N
> [235] M M M M M N M M M M M M N N M M N M M M M M M M M M M M M M M M M N M M M M N
> [274] N M M M M M M N M M M M M M M M M M N N M N M M M M M M M M M M N M N N M M M
> [313] M M M M M M M N M M M M M N M M M M M M M M M M M M M M M N M M M M M M M N M
> [352] M N M N M M N M M M M N M M M M M M M M M M N M M N N
> Levels: M N
> 
> When I enter:
> 
> > logistic.glm <- glm(data.logistic$X100.Yard.Average ~
> data.logistic$Overall + maj, family=binomial)
> > logistic.glm
> 
> I receive the following output:
> 
> Call:  glm(formula = data.logistic$X100.Yard.Average ~
> data.logistic$Overall +      maj, family = binomial)
> 
> Coefficients:
>           (Intercept)  data.logistic$Overall                   majN
>               2.38819               -0.02718               -0.18385
> 
> Degrees of Freedom: 377 Total (i.e. Null);  375 Residual
> Null Deviance:	    514.5
> Residual Deviance: 410.7 	AIC: 416.7
> 
> My question:  Why is there no output for majM?  Any help would be
> greatly appreciated