[R] can I do this with R?

Lucke, Joseph F Joseph.F.Lucke at uth.tmc.edu
Thu May 29 16:11:45 CEST 2008


Frank, I believe, is correct.  Using the AIC/BIC for data-driven model selection does NOT solve the "stepwise problem".  This is because selection changes the distribution of the sample AIC from its original distribution to that of an extreme value, e.g., min(AIC1, AIC2, ..., AICn). Thus, whatever properties are touted for the AIC cannot be assumed to apply to the stepwise AIC.  Of course, this same issue applies to R^2, p-values, and so forth.
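A small simulation may make the point concrete. This is only a sketch under assumptions of my own choosing (pure-noise predictors, single-predictor linear models, and the sample sizes below are all illustrative): it compares the AIC of a model fixed in advance with the minimum AIC over a set of candidates.

```r
set.seed(1)
n <- 50      # observations per simulated data set (illustrative)
p <- 10      # number of candidate predictors (illustrative)
reps <- 200  # number of simulated data sets

fixed_aic <- numeric(reps)
min_aic <- numeric(reps)

for (r in seq_len(reps)) {
  x <- matrix(rnorm(n * p), n, p)
  y <- rnorm(n)  # response is unrelated to every predictor
  # AIC of each single-predictor model
  aics <- vapply(seq_len(p), function(j) AIC(lm(y ~ x[, j])), numeric(1))
  fixed_aic[r] <- aics[1]   # model chosen before seeing the data
  min_aic[r] <- min(aics)   # model chosen because its AIC was smallest
}

# The selected AIC is systematically lower than the pre-specified one
c(fixed = mean(fixed_aic), selected = mean(min_aic))
```

The selected value can never exceed the pre-specified one, so its distribution is shifted downward even though no predictor carries any signal.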

Nonetheless, I use stepAIC with BIC penalty or CAIC penalty as the best of the bad.
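For reference, a minimal sketch of what this looks like, assuming simulated data and variable names that are purely hypothetical. stepAIC in the MASS package takes the penalty per parameter through its k argument; k = log(n) gives the BIC penalty, and k = log(n) + 1 is the usual form of the CAIC penalty.

```r
library(MASS)

set.seed(3)
n <- 150
# Hypothetical data: only x1 is related to the outcome
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(dat$x1))

full_fit <- glm(y ~ x1 + x2 + x3, family = binomial, data = dat)

# Backward elimination with the BIC penalty
sel_bic <- stepAIC(full_fit, direction = "backward", k = log(n), trace = FALSE)

# Backward elimination with the CAIC penalty
sel_caic <- stepAIC(full_fit, direction = "backward", k = log(n) + 1, trace = FALSE)

formula(sel_bic)
formula(sel_caic)
```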

Joseph F. Lucke, PhD
Biostatistician
Center for Clinical Research and Evidence-based Medicine
University of Texas Medical School at Houston
Email: Joseph.F.Lucke at uth.tmc.edu
 


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Xiaohui Chen
Sent: Wednesday, May 28, 2008 5:20 PM
To: Frank E Harrell Jr
Cc: r-help at r-project.org
Subject: Re: [R] can I do this with R?

The step or stepAIC functions do the job. You can opt to use BIC by changing the penalty multiplier (the k argument).
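As a rough sketch of the original question (forward stepwise logistic regression), assuming a made-up data set whose variable names are hypothetical: step() starts from the intercept-only model and adds terms from the given scope, with k = 2 for AIC and k = log(n) for BIC.

```r
set.seed(42)
n <- 200
# Hypothetical data: only x1 truly affects the outcome
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 1.5 * dat$x1))

# Start from the intercept-only logistic model
null_fit <- glm(y ~ 1, family = binomial, data = dat)
search_scope <- list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4)

# Forward selection with the AIC penalty (k = 2, the default)
fwd_aic <- step(null_fit, scope = search_scope, direction = "forward",
                k = 2, trace = 0)

# Forward selection with the BIC penalty (k = log(n))
fwd_bic <- step(null_fit, scope = search_scope, direction = "forward",
                k = log(n), trace = 0)

formula(fwd_aic)
formula(fwd_bic)
```

Because the BIC penalty is larger here, the BIC search stops no later than the AIC search and keeps no more terms.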

I think AIC and BIC are not limited to comparing two pre-defined models; they can also be used as model search criteria. You could compute the information criterion for every possible model if the full model is relatively small. But this does not generally scale to practical high-dimensional applications. Hence, it is often only possible to find a 'best' model that is a local optimum, e.g. as measured by AIC/BIC.
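To illustrate the enumeration for a small full model, here is a sketch with three hypothetical predictors and simulated data, so all 2^3 = 8 subsets can be fitted directly. The helper fit_one is my own name, not a library function.

```r
set.seed(7)
n <- 100
# Hypothetical data: only x1 is related to y
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)
preds <- c("x1", "x2", "x3")

# All subsets of the predictors, including the empty (intercept-only) one
subsets <- c(
  list(character(0)),
  unlist(lapply(seq_along(preds),
                function(k) combn(preds, k, simplify = FALSE)),
         recursive = FALSE)
)

# Fit one subset and report its AIC and BIC
fit_one <- function(vars) {
  rhs <- if (length(vars)) paste(vars, collapse = " + ") else "1"
  fit <- lm(as.formula(paste("y ~", rhs)), data = dat)
  data.frame(model = rhs, AIC = AIC(fit), BIC = BIC(fit))
}

results <- do.call(rbind, lapply(subsets, fit_one))
results[order(results$BIC), ]  # models ranked by BIC, best first
```

The number of subsets doubles with each added predictor, which is why this approach is only feasible for small models.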

Conversely, I would not say that BIC over-penalizes. Rather, I think AIC usually under-penalizes larger models, in the sense that it retains a positive probability of incorporating irrelevant variables in linear models.

X

Frank E Harrell Jr wrote:
> Smita Pakhale wrote:
>> Hi Maria,
>>
>> But why do you want to use forwards or backwards methods? These all 
>> are 'backward' methods of modeling.
>> Try using AIC or BIC. BIC is much better than AIC.
>> And, you do not have to believe me or any one else on this.
>
> How does that help? BIC gives too much penalization in certain 
> contexts; both AIC and BIC were designed to compare two pre-specified 
> models. They were not designed to fix problems of stepwise variable 
> selection.
>
> Frank
>
>>
>> Just make a small data set with a few variables with known 
>> relationship amongst them. With this simulated data set, use all your 
>> modeling methods: backwards, forwards, AIC, BIC etc and then see 
>> which one gives you an answer closest to the truth. The beauty of 
>> using a simulated dataset is that you 'know' the truth, as you are 
>> the 'creator' of it!
>>
>> smita
>>
>> --- Charilaos Skiadas <cskiadas at gmail.com> wrote:
>>
>>> A google search for "logistic regression with stepwise forward in r" 
>>> returns the following post:
>>>
>>>
>>> https://stat.ethz.ch/pipermail/r-help/2003-December/043645.html
>>> Haris Skiadas
>>> Department of Mathematics and Computer Science Hanover College
>>>
>>> On May 28, 2008, at 7:01 AM, Maria wrote:
>>>
>>>> Hello,
>>>> I am just about to install R and was wondering about a few things.
>>>> I have only worked in Matlab because I wanted to do a logistic
>>>> regression. However Matlab does not do logistic regression with
>>>> stepwise forward method. Therefore I thought about testing R. So my
>>>> question is
>>>> can I do logistic regression with stepwise forward in R?
>>>> Thanks /M

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


