[R] what is the difference between the two logistic models?

Thu Aug 13 00:04:33 CEST 2009

Hi All,

I have data on 400 individuals with the following variables:
Grade: pass or fail (coded as 1 for pass and 0 for fail)
Sex: male or female (coded as 1 for male and 2 for female)
Age
Teaching.method: takes the values 1, 2, or 3

I want to fit a logistic regression where the outcome is Grade (1 = pass,
0 = fail) and the rest of the variables are the regressors.
My question is that I am not sure when to use “factor” for a variable.

In my example, Grade, sex, and teaching method are categorical variables
coded as stated above.
Age is a continuous variable.

I have tried the model both ways. In the first model I wrap the categorical
variables in “factor”, but in that case I do not know how to interpret the
output.

Can someone shed some light on the difference between model1 and model2 and
how to interpret them?
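
For reference, the effect of “factor” can be seen in the design matrix that R builds before fitting. The sketch below is only an illustration: the data frame `toy` and its values are made up and are not the data from this post. It compares the columns R creates when sex and teaching.method are left numeric with the columns it creates when they are wrapped in factor().

```r
# Toy data, made up for illustration only (not the data from the post).
toy <- data.frame(
  sex             = c(1, 2, 1, 2, 1, 2),
  teaching.method = c(1, 2, 3, 1, 2, 3)
)

# Numeric coding: one column per variable, so each gets a single slope.
X_num <- model.matrix(~ sex + teaching.method, data = toy)

# Factor coding: a k-level variable becomes k - 1 indicator columns.
# The first level (here 1) is the baseline absorbed into the intercept.
X_fac <- model.matrix(~ factor(sex) + factor(teaching.method), data = toy)

print(colnames(X_num))
print(colnames(X_fac))
```

With numeric coding, teaching.method gets one slope, which assumes the log-odds change by the same amount from method 1 to 2 as from 2 to 3. With factor coding, methods 2 and 3 each get their own coefficient, measured against method 1.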

Below is my output

Thanks for your help

Call:
glm(formula = factor(Grade) ~ factor(sex) + age + factor(teaching.method), 
    family = binomial, data = data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.8649  -1.1926   0.7494   1.0091   1.6659  

Coefficients:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)              -2.77217    0.82182  -3.373 0.000743 ***
factor(sex)2             -0.34751    0.22960  -1.514 0.130140    
age                       0.04544    0.01074   4.230 2.34e-05 ***
factor(teaching.method)2 -0.07125    0.30123  -0.237 0.813023    
factor(teaching.method)3  0.50058    0.33087   1.513 0.130303    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 438.91  on 340  degrees of freedom
AIC: 448.91

Number of Fisher Scoring iterations: 4
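
One common way to read coefficients like these is to exponentiate them, which turns log-odds differences into odds ratios. The sketch below copies the estimates from the summary output above by hand; the vector name `est` is only an illustration. With the fitted object available, `exp(coef(...))` on that object should give the same numbers.

```r
# Estimates copied by hand from the summary output above (log-odds scale).
est <- c(
  "factor(sex)2"             = -0.34751,  # sex 2 (female) vs sex 1 (male)
  "age"                      =  0.04544,  # per one-unit increase in age
  "factor(teaching.method)2" = -0.07125,  # method 2 vs method 1
  "factor(teaching.method)3" =  0.50058   # method 3 vs method 1
)

# exp() converts each log-odds difference into an odds ratio.
odds_ratio <- exp(est)
print(round(odds_ratio, 3))
```

Each factor coefficient compares one level with the baseline level (sex 1, teaching method 1), holding the other variables fixed. For example, the age odds ratio is roughly 1.05, i.e. the odds of passing rise by about 5% per additional unit of age.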

> model2<-glm(Grade~ sex + age +teaching.method, family=binomial,data=ndata)
> summary(model2)

Call:
glm(formula = Grade ~ sex + age +teaching.method, family = binomial, 
    data = ndata)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.7959  -1.2122   0.7547   1.0043   1.5791  

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)     -2.83988    0.94749  -2.997  0.00272 ** 
sex             -0.33361    0.22867  -1.459  0.14458    
age              0.04432    0.01065   4.160 3.18e-05 ***
teaching.method  0.28017    0.16181   1.731  0.08336 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 465.18  on 344  degrees of freedom
Residual deviance: 440.85  on 341  degrees of freedom
AIC: 448.85

Number of Fisher Scoring iterations: 4
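
The two fits can also be compared through their residual deviances. The sketch below assumes both models were fitted to the same observations (the two calls name different data frames, `data` and `ndata`, but report the same null deviance and degrees of freedom). Under that assumption the numeric-coded model is a special case of the factor model, and the difference in deviance can be referred to a chi-squared distribution. The variable names are only illustrative.

```r
# Residual deviances and degrees of freedom copied from the two summaries.
dev_factor  <- 438.91; df_factor  <- 340   # model with factor() terms
dev_numeric <- 440.85; df_numeric <- 341   # model with numeric coding

# Likelihood-ratio statistic: the drop in deviance from the extra parameter.
lr_stat <- dev_numeric - dev_factor
lr_df   <- df_numeric - df_factor

# Upper-tail chi-squared probability for the observed drop.
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
print(c(statistic = lr_stat, df = lr_df, p.value = p_value))
```

The single extra degree of freedom comes from teaching.method, which has three levels; sex has only two levels, so coding it as a factor or as a number gives the same fit. If both fitted objects are available, `anova()` on the two models with `test = "Chisq"` should carry out the same comparison.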
