[R] Automatic creation of binary logistic models

Paul Smith phhs80 at gmail.com
Fri Aug 5 00:33:02 CEST 2011


On Thu, Aug 4, 2011 at 9:35 PM, Marc Schwartz <marc_schwartz at me.com> wrote:
>> Suppose that you are trying to create a binary logistic model by
>> trying different combinations of predictors. Has R got an automatic
>> way of doing this, i.e., is there some way of automatically generating
>> different tentative models and checking their corresponding AIC value?
>> If so, could you please direct me to an example?
>
> Hi Paul,
>
> If it were not for JSS going on at the moment, you would likely get a reply from Frank Harrell telling you why using this approach is not a good idea. This is tantamount to using a stepwise approach with variables going in and out of the model, based upon either AIC or perhaps Wald p values.
>
> If you search the R list archives using rseek.org with keywords such as "stepwise regression Harrell", you will see a plethora of discussions on this over the years.
>
> You might want to obtain a copy of Frank's book Regression Modeling Strategies along with Ewout Steyerberg's book Clinical Prediction Models, which cover this topic and offer alternative solutions to model development. These generally include the pre-specification of full models, considering how many covariate degrees of freedom you can reasonably include in the model and applying shrinkage/penalization.
>
> If you need to engage in data reduction, you might want to consider using the LASSO, as implemented in the glmnet package on CRAN. More information on this method is available at: http://www-stat.stanford.edu/~tibs/lasso.html. An alternative might be backward elimination, which Frank covers in:
>
>  http://biostat.mc.vanderbilt.edu/wiki/pub/Main/RmS/rms.pdf
>
> which is a supplement to his course.
>
> Automated creation of models ignores the expertise of both the statistician and subject matter experts, to the detriment of inference.

Thanks, Marc, for your very useful reply.
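
For anyone following the thread, here is a rough sketch of the idea behind the LASSO that Marc mentions: an L1 penalty on the coefficients of a logistic model shrinks weak predictors to exactly zero, so selection comes out of a single penalized fit rather than out of comparing many candidate models. It is plain Python and uses proximal gradient descent, not the coordinate descent that glmnet uses. The function names, the toy data, the step size and the penalty values are all made up for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Shrink value toward zero; return exactly 0.0 inside the threshold band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """L1-penalized logistic regression by proximal gradient descent (illustrative)."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for row, target in zip(X, y):
            eta = intercept + sum(b * x for b, x in zip(beta, row))
            resid = sigmoid(eta) - target  # fitted probability minus outcome
            grad0 += resid / n
            for j in range(p):
                grad[j] += resid * row[j] / n
        intercept -= step * grad0  # the intercept is not penalized
        # gradient step on the log-likelihood part, then the L1 shrinkage step
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta

# Hypothetical toy data: (x1, x2, outcome). x1 is a strong predictor,
# x2 is only weakly associated with the outcome.
rows = [
    (-2.0, 1.0, 0), (-2.0, -1.0, 0),
    (-1.0, 1.0, 0), (-1.0, -1.0, 0),
    (-0.5, 1.0, 1), (-0.5, 1.0, 1),
    (0.5, 1.0, 0), (0.5, -1.0, 0),
    (1.0, 1.0, 1), (1.0, -1.0, 1),
    (2.0, 1.0, 1), (2.0, -1.0, 1),
]
X = [[x1, x2] for x1, x2, _ in rows]
y = [label for _, _, label in rows]

_, beta_mid = fit_lasso_logistic(X, y, lam=0.2)  # moderate penalty
_, beta_big = fit_lasso_logistic(X, y, lam=0.5)  # heavy penalty
print("lam=0.2:", beta_mid)
print("lam=0.5:", beta_big)
```

With lam = 0.2 the first predictor keeps a positive coefficient while the second is set to exactly zero; with lam = 0.5 both are zero. In practice the penalty would be chosen by cross-validation (cv.glmnet in the glmnet package does this) rather than by hand.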

Paul


