[R] step, leaps, lasso, LSE or what?

Jari Oksanen jarioksa at sun3.oulu.fi
Fri Mar 1 12:03:32 CET 2002


ripley at stats.ox.ac.uk said:
> A second difference is the purpose of selecting a model.  AIC is
> intended to select a model which is large enough to include the `true'
> model, and hence to give good predictions.  There over-fitting is not
> a real problem. (There are variations on AIC which do not assume some
> model considered is true.)   This is a different aim from trying to
> find the `true' model or trying to find the smallest adequate model,
> both aims for explanation not prediction. 

This may be a stupid question, but perhaps I won't be lashed if I
confess my stupidity as a preventive measure. About minimal adequate
model*s*:  Murray Aitkin et al. have a book called "Statistical
Modelling in GLIM" (Ox UP, 1989) where they tell how to find a set of
adequate models in glm (with GLIM), and how one or *several* of these
adequate models may be minimal.  When I read the book, I found this
an attractive concept since it showed that you may have several about
equally good models with different terms, although usual selection
procedures (including best subsets) find only one.  I have quite often
seen people use automatic selection in several subsets of the data and
then say that the subsets are different because different regressors
were selected -- although the same regressors could have been about as
good, but they were never evaluated.

Now the question: Aitkin's procedure would be very easy to perform in R
(well, it was easy even in dear old GLIM!), but I have hardly seen it
used. Is there a reason for this? Is there something dubious in minimal
adequate models that makes them a no-no, an Erlkönig that catches us
innocent children?
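
A minimal sketch of such a search in R, under assumptions that are not
taken from the book: a submodel is called "adequate" when its scaled
deviance increase over the full model stays below the chi-squared
critical value on p degrees of freedom (p = number of candidate terms,
so the same simultaneous cut-off applies to every subset), and an
adequate model is "minimal" when no adequate model uses a proper subset
of its terms. The function name minimal_adequate() and its arguments are
hypothetical, and the exact criterion should be checked against Aitkin
(1974) before any real use.

```r
# Sketch only: exhaustive search for minimal adequate models.
# ASSUMPTION (not from the book): "adequate" means the scaled deviance
# increase over the full model is at most qchisq(1 - alpha, p), where p is
# the number of candidate terms. For gaussian models this chi-squared
# cut-off is only an approximation to an F-based one.
minimal_adequate <- function(response, terms, data, family = gaussian(),
                             alpha = 0.05) {
  p <- length(terms)

  # Fit the glm that keeps the terms flagged TRUE in 'keep'
  fit <- function(keep) {
    rhs <- if (any(keep)) paste(terms[keep], collapse = " + ") else "1"
    glm(as.formula(paste(response, "~", rhs)), family = family, data = data)
  }

  full <- fit(rep(TRUE, p))
  phi  <- summary(full)$dispersion      # 1 for binomial and poisson
  crit <- qchisq(1 - alpha, df = p)

  # All 2^p subsets of terms, one subset per row of a logical matrix
  grid <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), p)))
  colnames(grid) <- terms

  # Scaled deviance increase of each submodel relative to the full model
  stat <- apply(grid, 1, function(keep) {
    (deviance(fit(keep)) - deviance(full)) / phi
  })
  ok <- grid[stat <= crit, , drop = FALSE]   # adequate models

  # Model i is minimal if no other adequate model is nested within it
  is_minimal <- vapply(seq_len(nrow(ok)), function(i) {
    nested <- apply(ok, 1, function(other) {
      all(!other | ok[i, ]) && any(other != ok[i, ])
    })
    !any(nested)
  }, logical(1))

  minimal_rows <- ok[is_minimal, , drop = FALSE]
  lapply(seq_len(nrow(minimal_rows)), function(i) terms[minimal_rows[i, ]])
}

# Example with a built-in data set; each list element is one minimal
# adequate set of terms, and there may be several.
minimal_adequate("Fertility",
                 c("Agriculture", "Examination", "Education",
                   "Catholic", "Infant.Mortality"),
                 data = swiss)
```

The search fits all 2^p submodels, so it is only practical for a modest
number of candidate terms. The point of returning a list is that more
than one minimal adequate model can come back, which a single stepwise
path would not show.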

Bibliographic note: I know the procedure from the Aitkin et al. book,
and haven't checked the original references. These sources are cited in
the book:

Aitkin, M. A. 1974. Simultaneous inference and choice of variable 
subsets in multiple regression. Technometrics 16, 221--227.

Aitkin, M. A. 1978. The analysis of unbalanced cross-classifications
(with Discussion). J. Roy. Stat. Soc. A 141, 195--223.

cheers, jari oksanen
-- 
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
Ph. +358 8 5531526, cell +358 40 5136529, fax +358 8 5531061
email jari.oksanen at oulu.fi, homepage http://cc.oulu.fi/~jarioksa/

