[R] Logistic Regression - Variable Selection Methods With Prediction

Marc Schwartz marc_schwartz at me.com
Wed Oct 26 19:51:59 CEST 2011


The likely reason you are not getting replies is that what you propose to do is considered a poor way of building models.

You need to get out of the "SAS Mindset".

I would suggest you obtain a copy of Frank Harrell's book:

  http://www.amazon.com/exec/obidos/ASIN/0387952322/

and then consider using his 'rms' package on CRAN to work through model-building strategies and validation.
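
As a rough illustration of what that workflow can look like, here is a minimal sketch using rms. The data are simulated and the predictor names x1, x2 and x3 are hypothetical stand-ins, not taken from the original post; only the response name Graduation and its 'Y'/'N' coding come from the question. It fits one pre-specified model with lrm(), uses validate() to get bootstrap optimism-corrected performance indexes, and obtains predicted probabilities for new data with predict().

```r
library(rms)

set.seed(1)

# Simulated stand-in data: x1..x3 are hypothetical predictors.
n <- 400
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$Graduation <- factor(
  ifelse(runif(n) < plogis(-0.5 + 1.2 * d$x1 - 0.8 * d$x2), "Y", "N")
)

# Fit a pre-specified logistic model. x = TRUE, y = TRUE store the design
# matrix and response, which validate() needs for resampling.
fit <- lrm(Graduation ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)

# Bootstrap validation: reports apparent, optimism, and corrected indexes.
val <- validate(fit, B = 200)
val["Dxy", c("index.orig", "optimism", "index.corrected")]

# Predicted probabilities of 'Y' for new observations.
newd <- data.frame(x1 = c(-1, 0, 1), x2 = 0, x3 = 0)
p <- predict(fit, newdata = newd, type = "fitted")
p
```

The design choice here is to validate the whole modeling procedure by resampling rather than to pick a "winner" from a single train/test split scored by misclassification rate at a 0.5 cutoff.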

Regards,

Marc Schwartz

On Oct 26, 2011, at 11:35 AM, RAJ wrote:

> Can I at least get help with what package to use for logistic
> regression with all possible models and prediction? I know I can
> use regsubsets, but I am not sure if it has any prediction functions to
> go with it.
> 
> Thanks
> 
> On Oct 25, 6:54 pm, RAJ <dheerajathr... at gmail.com> wrote:
>> Hello,
>> 
>> I am pretty new to R; I have always used SAS and SAS products. My
>> target variable is binary ('Y' and 'N') and I have about 14 predictor
>> variables. My goal is to compare different variable selection methods
>> like Forward, Backward, and All possible subsets. I am using
>> misclassification rate to pick the winning method.
>> 
>> This is what I have as of now:
>> 
>> Reg <- glm(Graduation ~ ., DFtrain, family = binomial(link = "logit"))
>> step <- extractAIC(Reg, direction = "forward")
>> pred <- predict(Reg, DFtest, type = "response")
>> mis <- mean((pred > 0.5) != (DFtest[, "Graduation"] == "Y"))
>> 
>> This program actually works, but I need to check that I am
>> doing this right. Also, I am getting the same misclassification rates
>> for all the different methods.
>> 
>> I also tried to use
>> 
>> Reg <- leaps(Graduation ~ ., DFtrain)
>> pred <- predict(Reg, DFtest, type = "response")
>> mis <- mean((pred > 0.5) != (DFtest[, "Graduation"] == "Y"))
>> # print(summary(mis))
>> 
>> which doesn't work
>> 
>> and
>> 
>> Reg <- regsubsets(Graduation ~ ., DFtrain)
>> pred <- predict(Reg, DFtest, type = "response")
>> mis <- mean((pred > 0.5) != (DFtest[, "Graduation"] == "Y"))
>> # print(summary(mis))
>> 
>> regsubsets will run, but the 'predict' function does not work with
>> it. Is there any other way to do predictions when using regsubsets?
>> 
>> Any help is appreciated.
>> 
>> Thanks,
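
On the identical misclassification rates in the quoted glm code: there is a mechanical explanation, independent of whether stepwise selection is a good idea. extractAIC() does no selection; it returns two numbers (the equivalent degrees of freedom and the AIC), and its 'direction' argument appears to be silently absorbed by '...'. predict() is then called on the unchanged full fit Reg every time. The sketch below uses simulated data with hypothetical predictors x1 to x4 to show the difference: step() is the function that searches, and its return value is the model that would have to be passed to predict().

```r
set.seed(42)

# Simulated stand-in data: x1..x4 are hypothetical predictors.
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
  d$Graduation <- factor(
    ifelse(runif(n) < plogis(1.5 * d$x1 - d$x2), "Y", "N")
  )
  d
}
DFtrain <- make_data(300)
DFtest  <- make_data(200)

full <- glm(Graduation ~ ., data = DFtrain, family = binomial(link = "logit"))

# extractAIC() only reports c(edf, AIC) for the model it is given.
aic_only <- extractAIC(full)

# step() performs the search and returns the selected fitted model.
selected <- step(full, direction = "backward", trace = 0)

# Misclassification rate at a 0.5 cutoff, computed for a given model.
misclass <- function(model) {
  pred <- predict(model, newdata = DFtest, type = "response")
  mean((pred > 0.5) != (DFtest$Graduation == "Y"))
}

rates <- c(full = misclass(full), selected = misclass(selected))
rates
```

This only explains the symptom. It does not address the concerns about selection by misclassification rate raised in the reply above.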


