[R] How do I remove predictors with p-values greater than 0.05 in a GLM?

shubha shuba.pandit at gmail.com
Mon Nov 22 17:13:32 CET 2010


Hi R users,
I am an intermediate R user. I am fitting GLMs (libraries MASS, VEGUS) and
ran a backward stepwise logistic regression, but I have a problem removing
predictors whose p-values are above 0.05. I do not want variables with
p-values above 0.05 in the final backward stepwise logistic regression
model.

For example, first I fit the model:
 name <- glm(dep ~ env1 + env2..., family = binomial, data = new)

After that, I ran the stepwise procedure on name:

name.step<-step(name, direction="backward")
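The two calls above can be reproduced on simulated data. This is only a minimal sketch: the data, the seed, and the names env1, env2, env3 and dep are placeholders, not the real variables. It also shows that step() compares models by AIC and does not look at p-values, so a term can stay in the selected model even when its Wald p-value is above 0.05.

```r
# Simulated stand-in for the real data; all names here are placeholders
set.seed(1)
n <- 500
new <- data.frame(env1 = rnorm(n), env2 = rnorm(n), env3 = rnorm(n))
# Only env1 truly affects the response in this simulation
new$dep <- rbinom(n, 1, plogis(-0.5 + 1.2 * new$env1))

# Full logistic regression model
name <- glm(dep ~ env1 + env2 + env3, family = binomial, data = new)

# Backward selection: a term is dropped only when dropping it lowers AIC
name.step <- step(name, direction = "backward", trace = 0)

# Wald p-values of the selected model; these were not the selection criterion
summary(name.step)$coefficients
c(full = AIC(name), selected = AIC(name.step))
```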

Here, I still got variables that were not significant. For example, SECCHI
was not significant (see the example below), but it was still in the
model. How can I remove variables that are not significant in
forward/backward stepwise selection?
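One possible way to eliminate terms by p-value instead of AIC is to loop over drop1() with a likelihood-ratio test. This is a sketch under assumptions: backward_by_p is a hypothetical helper, not a function from MASS or base R, and the data are simulated placeholders.

```r
# Hypothetical helper: repeatedly drop the droppable term with the largest
# likelihood-ratio p-value until every remaining droppable term has p <= alpha
backward_by_p <- function(model, alpha = 0.05) {
  repeat {
    tab <- drop1(model, test = "Chisq")
    p <- tab[["Pr(>Chi)"]][-1]          # first row is "<none>"
    candidates <- rownames(tab)[-1]
    if (length(p) == 0 || max(p) <= alpha) break
    worst <- candidates[which.max(p)]
    model <- update(model, as.formula(paste(". ~ . -", worst)))
  }
  model
}

# Simulated placeholder data; only env1 has a real effect
set.seed(1)
n <- 500
new <- data.frame(env1 = rnorm(n), env2 = rnorm(n), env3 = rnorm(n))
new$dep <- rbinom(n, 1, plogis(-0.5 + 1.2 * new$env1))

full <- glm(dep ~ env1 + env2 + env3, family = binomial, data = new)
reduced <- backward_by_p(full)
formula(reduced)
drop1(reduced, test = "Chisq")
```

drop1() respects marginality: a main effect such as SECCHI is not offered for removal while an interaction containing it (for example GEARTEMP:SECCHI) is still in the model, so it can remain even with a large p-value.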

Another question: when I set direction="backward", I got the same results
as with "forward". This is really strange. Why are the results the same for
backward and forward? I checked in two other statistical packages
(Statistica and SYSTAT), and I think they gave correct results. But I need
to use R for further analysis, so I need to fix the problem. I have spent a
lot of time trying to figure it out, but I could not. Could you please give
me your suggestions? It would be a great help. Please see below the example
of predictors with p-values greater than 0.05 being retained after stepwise
logistic regression.
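For comparison, here is a sketch of how the two directions are usually set up in step(). It assumes simulated placeholder data, and it assumes the forward run was started from the full model, which may not be what actually happened. With no scope argument, a forward search from the full model has nothing to add and returns that model unchanged; a forward search normally starts from the intercept-only model with an upper scope.

```r
# Simulated placeholder data; only env1 has a real effect
set.seed(1)
n <- 500
new <- data.frame(env1 = rnorm(n), env2 = rnorm(n), env3 = rnorm(n))
new$dep <- rbinom(n, 1, plogis(-0.5 + 1.2 * new$env1))

full <- glm(dep ~ env1 + env2 + env3, family = binomial, data = new)
null <- glm(dep ~ 1, family = binomial, data = new)

# Forward from the full model without a scope: no terms are available to add
fwd_from_full <- step(full, direction = "forward", trace = 0)

# Forward from the null model, allowed to grow up to the full formula
fwd <- step(null,
            scope = list(lower = ~ 1, upper = ~ env1 + env2 + env3),
            direction = "forward", trace = 0)

# Backward from the full model
bwd <- step(full, direction = "backward", trace = 0)

formula(fwd_from_full)
formula(fwd)
formula(bwd)
```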

Thanks,
Shubha Pandit, PhD
University of Windsor
Windsor, ON, Canada
====
 

> summary(step.glm.int.ag1)

Call:
glm(formula = ag1less ~ GEARTEMP + DOGEAR + GEARDEPTH + SECCHI +
    GEARTEMP:SECCHI + DOGEAR:SECCHI + GEARTEMP:DOGEAR + GEARTEMP:GEARDEPTH +
    DOGEAR:GEARDEPTH, family = binomial, data = training)

Deviance Residuals:
    Min       1Q   Median       3Q      Max 
-2.1983  -0.8272  -0.4677   0.8014   2.6502 

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)   
(Intercept)         3.231623   1.846593   1.750 0.080110 . 
GEARTEMP           -0.004408   0.085254  -0.052 0.958761   
DOGEAR             -0.732805   0.182285  -4.020 5.82e-05 ***
GEARDEPTH          -0.249237   0.060825  -4.098 4.17e-05 ***
SECCHI              0.311875   0.297594   1.048 0.294645   
GEARTEMP:SECCHI    -0.080664   0.010079  -8.003 1.21e-15 ***
DOGEAR:SECCHI       0.066555   0.022181   3.000 0.002695 **
GEARTEMP:DOGEAR     0.030988   0.008907   3.479 0.000503 ***
GEARTEMP:GEARDEPTH  0.008856   0.002122   4.173 3.01e-05 ***
DOGEAR:GEARDEPTH    0.006680   0.004483   1.490 0.136151   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3389.5  on 2751  degrees of freedom
Residual deviance: 2720.4  on 2742  degrees of freedom
AIC: 2740.4

Number of Fisher Scoring iterations: 6

==========================

> glm.int.ag1 <- glm(ag1less ~ GEARTEMP + DOGEAR + GEARDEPTH + SECCHI +
+     SECCHI*GEARTEMP + SECCHI*DOGEAR + SECCHI*GEARDEPTH +
+     GEARTEMP*DOGEAR + GEARTEMP*GEARDEPTH + GEARDEPTH*DOGEAR,
+     data = training, family = binomial)
> summary(glm.int.ag1)

Call:
glm(formula = ag1less ~ GEARTEMP + DOGEAR + GEARDEPTH + SECCHI +
    SECCHI * GEARTEMP + SECCHI * DOGEAR + SECCHI * GEARDEPTH +
    GEARTEMP * DOGEAR + GEARTEMP * GEARDEPTH + GEARDEPTH * DOGEAR,
    family = binomial, data = training)

Deviance Residuals:
    Min       1Q   Median       3Q      Max 
-2.1990  -0.8287  -0.4668   0.8055   2.6673 

Coefficients:
                    Estimate Std. Error z value Pr(>|z|)   
(Intercept)         2.909805   1.928375   1.509 0.131314   
GEARTEMP            0.005315   0.087159   0.061 0.951379   
DOGEAR             -0.721864   0.183708  -3.929 8.52e-05 ***
GEARDEPTH          -0.235961   0.064828  -3.640 0.000273 ***
SECCHI              0.391445   0.326542   1.199 0.230622   
GEARTEMP:SECCHI    -0.082296   0.010437  -7.885 3.14e-15 ***
DOGEAR:SECCHI       0.065572   0.022319   2.938 0.003305 **
GEARDEPTH:SECCHI   -0.003176   0.005295  -0.600 0.548675   
GEARTEMP:DOGEAR     0.030571   0.008961   3.412 0.000646 ***
GEARTEMP:GEARDEPTH  0.008692   0.002159   4.027 5.66e-05 ***
DOGEAR:GEARDEPTH    0.006544   0.004495   1.456 0.145484   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3389.5  on 2751  degrees of freedom
Residual deviance: 2720.0  on 2741  degrees of freedom
AIC: 2742

Number of Fisher Scoring iterations: 6


