[R] Stepwise logistic regression....take too long...

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Apr 21 08:01:14 CEST 2008


On Sun, 20 Apr 2008, Marko Milicic wrote:

> Dear R helpers,
>
> I'm trying to build a logistic regression model on a large dataset with
> 360 factors and 850 observations. All 360 factors are known to be good
> predictors of the outcome variable but I have to find best model with
> maximum 10 factors. I tried to fit the full model and use the stepAIC
> function to get the best model but, unfortunately, the process takes too
> long to complete (more than 4 hours)...
>
> Is this the expected behaviour of the stepAIC function from the MASS
> package, or am I doing something wrong?

Both.  Work out how many fits you need for backwards elimination. (It is 
tens of thousands.)
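
As a rough check of that count, the arithmetic can be sketched in R. This assumes a main-effects model with one term per factor, and that each step of backwards elimination refits the model once for every term still in it. The variable names are illustrative, and the stopping point of 10 terms is taken from the question; stepAIC itself stops on AIC, not at a fixed model size.

```r
# Rough count of model fits for backwards elimination (a sketch, not a benchmark).
n_terms_start <- 360  # terms in the full model
n_terms_stop  <- 10   # assumption: elimination is run down to 10 terms

# At a step with k terms in the model, each of the k single-term
# deletions is tried, which costs one refit per deletion.
fits_per_step <- n_terms_start:(n_terms_stop + 1)
total_fits <- sum(fits_per_step)
total_fits  # 64925
```

Each of those fits is itself an iterative glm fit, so even a fraction of a second per fit adds up to hours.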

'I have to find best model with maximum 10 factors' looks like a homework 
problem to me.  Where does the round number 10 come from?

Also, unless almost all the 'factors' have only two levels this looks like 
over-fitting for a single model, let alone after model dredging.
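
To make the over-fitting point concrete: under the usual treatment contrasts a factor with k levels contributes k - 1 coefficients, so the size of the full model can be sketched as below. The numbers of levels are hypothetical, since the question does not give them, and the helper `n_coef` is made up for this illustration.

```r
# Coefficients in the full main-effects model versus observations available.
n_obs     <- 850
n_factors <- 360

# Intercept plus (k - 1) coefficients for each factor with k levels.
n_coef <- function(k) 1 + n_factors * (k - 1)

k_levels <- c(2, 3, 4)  # hypothetical: every factor has this many levels
coefs    <- sapply(k_levels, n_coef)

data.frame(levels_per_factor = k_levels,
           coefficients      = coefs,
           obs_per_coef      = round(n_obs / coefs, 2))
```

With two levels per factor there are 361 coefficients for 850 observations; with four levels per factor there are more coefficients than observations.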


>
> Any suggestions?
>
> Thanks
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595