[R] Stepwise GLM selection by LRT?

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jul 12 21:33:00 CEST 2007

```On Thu, 12 Jul 2007, Lutz Ph. Breitling wrote:

> Thank you very much for the prompt reply. Seems like I had not fully
> understood what the k-parameter to stepAIC is doing.
> Your suggested approach looks indeed fine to me, actually I do not
> quite understand why you say that it's only an approximation to the
> LRT?

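So this is computing AIC_k = -2L + kp, with L the maximized log-likelihood
and p the number of parameters.  If you compare models with p and p+q
parameters, this is equivalent to comparing 2 log LR with kq, and so for
q=1 the Wilks' LRT is found for k = qchisq(1-p, df=1), where p is now the
significance level rather than the parameter count (the chi-squared on 1 df
is just a squared Normal).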
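
# A minimal sketch (not part of the original message) of the q = 1 case.
# 'alpha' is an illustrative name for the test level the thread calls p.
alpha <- c(0.10, 0.05)
k <- qchisq(1 - alpha, df = 1)   # penalty per parameter to pass to stepAIC
z <- qnorm(1 - alpha / 2)        # two-sided standard Normal quantile
print(round(k, 2))               # 2.71 3.84
print(all.equal(k, z^2))         # TRUE: chi-squared(1) quantile = squared Normal quantile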

However, no one said q would always be one, and stepAIC steps in terms,
not individual coefficients.  Therein lies one of the approximations
(another is in the asymptotic distribution theory of the test).
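
# A hedged sketch (not from the original message) of why q > 1 matters.
# With penalty k, a q-df term is dropped unless 2 log LR exceeds k * q,
# whereas a level-alpha LRT would compare 2 log LR with
# qchisq(1 - alpha, df = q).  'alpha' and 'q' are illustrative names.
alpha <- 0.05
k <- qchisq(1 - alpha, df = 1)
q <- 1:4                                    # degrees of freedom of a term
aic_cut <- k * q                            # cut-off implied by the k penalty
lrt_cut <- qchisq(1 - alpha, df = q)        # cut-off for a level-alpha LRT
implied <- pchisq(aic_cut, df = q, lower.tail = FALSE)  # level actually used
print(cbind(q, aic_cut, lrt_cut, implied))  # the cut-offs agree only at q = 1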

> Best wishes-
> Lutz
>
>> Check out the stepAIC function in MASS package.  This is a nice tool, where
>> you can actually implement any penalty even though the function's name has
>> "AIC" in it because it is the default.  Although this doesn't do an LRT test
>> based variable selection, you can sort of approximate it by using a penalty
>> of k = qchisq(1-p, df=1), where p is the p-value for variable selection.
>> This penalty means that a variable enters/exits an existing model when
>> twice the absolute change in log-likelihood is greater than qchisq(1-p,
>> df=1). For p = 0.1, k = 2.71, and for p=0.05, k = 3.84.  Is this what
>> you'd like to do?
>>
>> Ravi.
>>
>>
>> -----Original Message-----
>> From: r-help-bounces at stat.math.ethz.ch
>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Lutz Ph. Breitling
>> Sent: Wednesday, July 11, 2007 3:06 PM
>> To: r-help at stat.math.ethz.ch
>> Subject: [R] Stepwise GLM selection by LRT?
>>
>> Dear List,
>>
>> having searched the help and archives, I have the impression that
>> there is no automatic model selection procedure implemented in R that
>> includes/excludes predictors in logistic regression models based on
>> LRT P-values. Is that true, or is someone aware of an appropriate
>> function somewhere in a custom package?
>>
>> Even if automatic model selection and LRT might not be the most
>> appropriate methods, I actually would like to use these in order to
>> simulate someone else's modeling approach...
>>
>> Many thanks for all comments-
>> Lutz
>> -----
>> Lutz Ph. Breitling
>> German Cancer Research Center
>> Heidelberg/Germany
>>
>>
>
>

