[R] which alternative tests instead of AIC/BIC for choosing models

Pedro.Rodriguez at sungard.com Pedro.Rodriguez at sungard.com
Thu Aug 21 14:24:29 CEST 2008


Hi Ben,

Try the following reference:

Implementing Statistical Criteria to Select Return Forecasting Models:
What Do We Learn? by Peter Bossaerts and Pierre Hillion, Review of
Financial Studies, Vol. 12, No. 2.

I have created an R function which implements Bossaerts and Hillion's
methodology. If you need it, I will be more than happy to post it
online.

Please let me know if you need additional information. 

Kind Regards,

Pedro N. 



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Ben Bolker
Sent: Wednesday, August 13, 2008 5:38 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] which alternative tests instead of AIC/BIC for
choosing models


> > Dear R Users,
> >
> > I am looking for an alternative to AIC or BIC to choose model
> > parameters. This is somewhat of a general statistics question, but I
> > ask it in this forum as I am looking for an R solution.
> >
> > Suppose I have one dependent variable, y, and two independent
> > variables, x1 and x2.
> >
> > I can perform three regressions:
> > reg1: y~x1
> > reg2: y~x2
> > reg3: y~x1+x2
> >
> > The AIC of reg1 is 2000, reg2 is 1000 and reg3 is 950. One would,
> > presumably, conclude that one should use both x1 and x2. However, the
> > R^2's are quite different: R^2 of reg1 is 0.5%, reg2 is 95% and reg3
> > is 95.25%. Knowing that, I would actually conclude that x1 adds little
> > and should probably not be used.
> >
> > There is the overall question of what potentially explains this
> > outcome, i.e. the reduction in AIC in going from reg2 to reg3 even
> > though R^2 does not materially improve with the addition of x1 to
> > reg2 (to get to reg3). But that is more of a generic statistics issue
> > and not my question here.
> >

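  [For concreteness, a minimal sketch of the comparison the quoted
question describes. The data-generating process below is hypothetical,
chosen only to loosely mimic the reported R^2 pattern; only the formulas
y ~ x1, y ~ x2 and y ~ x1 + x2 come from the question.]

  ## Hypothetical data: x2 explains most of the variance in y, x1 almost none
  set.seed(1)
  n  <- 5000
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  y  <- 0.05 * x1 + 2 * x2 + rnorm(n)

  reg1 <- lm(y ~ x1)
  reg2 <- lm(y ~ x2)
  reg3 <- lm(y ~ x1 + x2)

  ## Information criteria and R^2 side by side
  AIC(reg1, reg2, reg3)
  BIC(reg1, reg2, reg3)
  sapply(list(reg1 = reg1, reg2 = reg2, reg3 = reg3),
         function(m) summary(m)$r.squared)
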
  I know you didn't ask the "generic statistics question", but
I think it's fairly important.  I suspect the reason that
you're getting (what you consider to be) a "spurious" result
that includes x1, or equivalently that your delta-AICs are
so big, is that you have a huge data set.  Lindsey (p. 15)
talks a bit about calibration that changes with the size of 
the data set.

  Model 3 will very probably give you better predictive power
than model 2.  If you want to select on the basis of improvement
in R^2, why not just do that?
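
  [A hedged illustration of that point, again with hypothetical data
rather than the poster's: the helper delta_fit below is illustrative
only, and simply tracks how the AIC difference between y ~ x2 and
y ~ x1 + x2 grows with the sample size while the gain in R^2 stays
tiny.]

  ## Illustrative helper: delta-AIC and delta-R^2 for a given sample size,
  ## where x1 contributes only a tiny amount of signal relative to x2
  delta_fit <- function(n, beta1 = 0.05) {
    x1 <- rnorm(n); x2 <- rnorm(n)
    y  <- beta1 * x1 + 2 * x2 + rnorm(n)
    reg2 <- lm(y ~ x2)
    reg3 <- lm(y ~ x1 + x2)
    c(n         = n,
      delta_AIC = AIC(reg2) - AIC(reg3),
      delta_R2  = summary(reg3)$r.squared - summary(reg2)$r.squared)
  }

  set.seed(1)
  t(sapply(c(100, 1000, 10000, 100000), delta_fit))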

  Ben Bolker

Lindsey, J. K. 1999. Some Statistical Heresies. The Statistician 48, no.
1: 1-40.



