[R] bug in step()?

Chong Gu chong at stat.purdue.edu
Fri Feb 16 18:09:52 CET 2001


Folks,

There appears to be a bug(?) in step() when used to screen logistic
models.  The problem appears to be specific to 1.2.1 (or maybe also
1.2.0?), as the proper behavior was observed in earlier versions.  When
I did the same on the surrogate log linear model, everything seemed
okey.

The data involved was the detergent data found in earlier editions of
MASS, as given below,

   > detg1
      Temp M.user   Soft  M  X
   1   Low      N   Hard 42 68
   3  High      N   Hard 30 42
   5   Low      Y   Hard 52 37
   7  High      Y   Hard 43 24
   9   Low      N Medium 50 66
   11 High      N Medium 23 33
   13  Low      Y Medium 55 47
   15 High      Y Medium 47 23
   17  Low      N   Soft 53 63
   19 High      N   Soft 27 29
   21  Low      Y   Soft 49 57
   23 High      Y   Soft 29 19

A constant model was fit to the data

   > detg1.m0 <- glm(cbind(X,M)~1,binomial,detg1)
   > detg1.m0

   Call:  glm(formula = cbind(X, M) ~ 1, family = binomial, data = detg1) 

   Coefficients:
   (Intercept)  
       0.01587  

   Degrees of Freedom: 11 Total (i.e. Null);  11 Residual
   Null Deviance:      32.83 
   Residual Deviance: 32.83        AIC: 92.52 

The following was the results I got from R.1.2.1 on both AIX and FreeBSD,

   > step(detg1.m0,scope=list(upper=~M.user*Temp*Soft))
   Start:  AIC= 92.52 
    cbind(X, M) ~ 1 

            Df Deviance     AIC
   <none>         32.83   92.52
   + M.user  1  1059.98 1121.67
   + Temp    1  2236.57 2298.27
   + Soft    2  2565.93 2629.63

   Call:  glm(formula = cbind(X, M) ~ 1, family = binomial, data = detg1) 

   Coefficients:
   (Intercept)  
       0.01587  

   Degrees of Freedom: 11 Total (i.e. Null);  11 Residual
   Null Deviance:	    32.83 
   Residual Deviance: 32.83 	AIC: 92.52 

I then tried stepAIC() in the MASS package, although I was not sure
why it should be any different.  The following was the results,

   > stepAIC(detg1.m0,scope=list(lower=~.,upper=~.+M.user))
   Start:  AIC= 92.52 
    cbind(X, M) ~ 1 

            Df Deviance    AIC
   + M.user  1   12.244 73.942
   <none>        32.826 92.524

   Step:  AIC= 73.94 
    cbind(X, M) ~ M.user 

            Df Deviance    AIC
   - M.user  1    0.428 60.125
   <none>        12.244 73.942

   Step:  AIC= 92.52 
   cbind(X, M) ~ 1 


   Call:  glm(formula = cbind(X, M) ~ 1, family = binomial, data = detg1) 

   Coefficients:
   (Intercept)  
       0.01587  

   Degrees of Freedom: 11 Total (i.e. Null);  11 Residual
   Null Deviance:	    32.83 
   Residual Deviance: 32.83 	AIC: 92.52 
   Warning message: 
   non-integer #successes in a binomial glm! in: eval(expr, envir, enclos) 

The AIC obtained by stepAIC() appeared to be correct in the first step
add1() operation, but the drop1() screwed up in the next step and
brought me right back to the constant model.

Will check on my linux box tonight, but this one might not be
platform-specific as I was able to duplicate on two platforms so far.

Chong
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list