[R] discrepancy in the result of R and SAS on same data in logistics regression

Atul Malik a.malik at decisioncraft.com
Fri Oct 5 07:39:00 CEST 2007


Dear Members,

Greetings!

I have come across a discrepancy between the R and SAS results for logistic regression on the same data.

When I processed the above csv file (1000.csv) to predict Action (i/c) from AgeGroup (1-7, Na) and Gender (M, F, Na) with glm in R, I get:

R result

Call:
glm(formula = Action ~ Gender + AgeGroup, family = binomial, 
    data = mydata1, na.action = na.pass)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.828  -0.973  -0.709   1.087   1.734  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)   1.2939     0.3180   4.069 4.73e-05 ***
GenderM      -0.8794     0.1637  -5.371 7.85e-08 ***
GenderNa     -1.4407     0.2749  -5.240 1.60e-07 ***
AgeGroup2    -1.2053     0.3971  -3.035  0.00240 ** 
AgeGroup3    -1.6670     0.3262  -5.110 3.21e-07 ***
AgeGroup4    -1.0786     0.3714  -2.904  0.00368 ** 
AgeGroup5    -0.8232     0.3829  -2.150  0.03156 *  
AgeGroup6     0.1682     0.3501   0.481  0.63081    
AgeGroup7    -0.3361     0.3617  -0.929  0.35281    
AgeGroupNa   -1.7956     0.3433  -5.231 1.69e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1342.7  on 999  degrees of freedom
Residual deviance: 1213.2  on 990  degrees of freedom
AIC: 1233.2

Number of Fisher Scoring iterations: 4

whereas SAS gives, on the same data:

Analysis of Maximum Likelihood Estimates

                                      Standard         Wald
Parameter      Action  DF  Estimate      Error   Chi-Square   Pr > ChiSq
Intercept      c        1    0.3217     0.0953      11.4025       0.0007
AgeGroup  2    c        1    0.3631     0.2434       2.2260       0.1357
AgeGroup  3    c        1    0.8248     0.1411      34.1508       <.0001
AgeGroup  4    c        1    0.2364     0.2146       1.2136       0.2706
AgeGroup  5    c        1   -0.0190     0.2299       0.0068       0.9343
AgeGroup  6    c        1   -1.0104     0.1822      30.7454       <.0001
AgeGroup  7    c        1   -0.5061     0.1974       6.5711       0.0104
AgeGroup  Na   c        1    0.9534     0.1718      30.7884       <.0001
Gender    M    c        1    0.1060     0.1103       0.9246       0.3363
Gender    N    c        1    0.6674     0.1686      15.6744       <.0001




I compared the resulting probabilities of Action "c" from all three packages (R, SAS and StatGraphics) and found that R and StatGraphics give the same results, but SAS gives different results for some combinations of AgeGroup and Gender, as shown in the attached document of probabilities of Action.
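One way to make this comparison reproducible is to rebuild the probability of Action "c" for every AgeGroup and Gender combination directly from the printed coefficients. The sketch below is only a check under stated assumptions, not a diagnosis: it assumes that R's glm models the probability of the second level of Action (taken here to be "i", with AgeGroup 1 and Gender F as reference levels), and that the SAS table models Action "c" using effect coding (the default for CLASS variables in PROC LOGISTIC), so that the effect of each omitted level is minus the sum of the listed ones. All object names are made up for illustration.

```r
# Coefficients copied from the R output above.
# Assumption: R models logit P(Action = "i"); AgeGroup 1 and Gender F are reference levels.
r_int <- 1.2939
r_age <- c("1" = 0, "2" = -1.2053, "3" = -1.6670, "4" = -1.0786,
           "5" = -0.8232, "6" = 0.1682, "7" = -0.3361, "Na" = -1.7956)
r_gen <- c(F = 0, M = -0.8794, Na = -1.4407)

# Coefficients copied from the SAS output above.
# Assumption: SAS models logit P(Action = "c") with effect coding, so the
# omitted level (AgeGroup 1, Gender F) gets minus the sum of the listed effects.
s_int <- 0.3217
s_age <- c("2" = 0.3631, "3" = 0.8248, "4" = 0.2364, "5" = -0.0190,
           "6" = -1.0104, "7" = -0.5061, "Na" = 0.9534)
s_age <- c("1" = -sum(s_age), s_age)
s_gen <- c(M = 0.1060, Na = 0.6674)
s_gen <- c(F = -sum(s_gen), s_gen)

# One row per AgeGroup x Gender combination.
grid <- expand.grid(AgeGroup = names(r_age), Gender = names(r_gen),
                    stringsAsFactors = FALSE)

# P(Action = "c") implied by each set of coefficients.
eta_r <- r_int + r_age[grid$AgeGroup] + r_gen[grid$Gender]
eta_s <- s_int + s_age[grid$AgeGroup] + s_gen[grid$Gender]
grid$p_c_R   <- unname(1 - plogis(eta_r))  # R is assumed to model "i"
grid$p_c_SAS <- unname(plogis(eta_s))      # SAS is assumed to model "c"

print(grid, digits = 4)
max_abs_diff <- max(abs(grid$p_c_R - grid$p_c_SAS))
print(max_abs_diff)
```

If the two assumptions hold, the two probability columns should agree up to the rounding of the printed coefficients. A large difference would instead point to a genuinely different model or data set.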


I would appreciate it if you could help me sort out the issue.

Thanks and Best Regards
Atul Malik

The StatGraphics results are as follows:

Estimated Regression Model (Maximum Likelihood)

                           Standard    Estimated
Parameter     Estimate        Error   Odds Ratio
CONSTANT      -1.94239     0.298622
AgeGroup=1     1.79555     0.343277      6.02282
AgeGroup=2     0.590229    0.316943      1.8044
AgeGroup=3     0.128605    0.216341      1.13724
AgeGroup=4     0.716996    0.288917      2.04827
AgeGroup=5     0.972326    0.30544       2.64409
AgeGroup=6     1.9638      0.262721      7.12638
AgeGroup=7     1.45945     0.275966      4.3036
Gender=F       1.44072     0.274922      4.22375
Gender=M       0.56134     0.256286      1.75302

 

Analysis of Deviance

Source           Deviance     Df   P-Value
Model             129.506      9    0.0000
Residual          1213.21    990    0.0000
Total (corr.)     1342.71    999
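The StatGraphics estimates can be compared with the R estimates by re-expressing them relative to the same reference levels. The sketch below assumes that StatGraphics omitted the Na level of both AgeGroup and Gender (neither appears in its table), while R used AgeGroup 1 and Gender F as reference levels. It is only a consistency check on the printed numbers, and the object names are made up for illustration.

```r
# StatGraphics estimates copied from the table above.
# Assumption: Na is the omitted (reference) level of both factors, so its effect is 0.
sg_const <- -1.94239
sg_age <- c("1" = 1.79555, "2" = 0.590229, "3" = 0.128605, "4" = 0.716996,
            "5" = 0.972326, "6" = 1.9638, "7" = 1.45945, "Na" = 0)
sg_gen <- c(F = 1.44072, M = 0.56134, Na = 0)

# Re-express relative to AgeGroup 1 and Gender F, the reference levels in the R output.
gen_shift <- sg_gen[c("M", "Na")] - sg_gen[["F"]]
age_shift <- sg_age[-1] - sg_age[["1"]]
from_sg <- c(sg_const + sg_age[["1"]] + sg_gen[["F"]], gen_shift, age_shift)
names(from_sg) <- c("(Intercept)", "GenderM", "GenderNa",
                    paste0("AgeGroup", names(sg_age)[-1]))

# Estimates copied from the R summary, in the same order.
r_coef <- c(1.2939, -0.8794, -1.4407, -1.2053, -1.6670,
            -1.0786, -0.8232, 0.1682, -0.3361, -1.7956)

comparison <- data.frame(R = r_coef, from_StatGraphics = round(from_sg, 4))
print(comparison)
```

Only the estimates are compared here. The standard errors are not expected to match, since they refer to contrasts against different reference levels.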