[R] Avoiding overfitting in gam(mgcv)

kanno at coastal.t.u-tokyo.ac.jp
Tue Oct 2 03:21:59 CEST 2007


Dear listers,

I'm using gam (from mgcv) for semi-parametric regression on small datasets (10 to 200 observations), and I am running into overfitting.

The help page suggests avoiding overfitting by raising the "gamma" argument (e.g. to 1.4), which inflates the effective degrees of freedom in the GCV score. In my case, however, this made little difference to the results.
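
A minimal sketch of the "gamma" setting described above, assuming simulated data in place of the real datasets (the data, the seed, and the names dat, fit_default and fit_gamma are made up for illustration):

```r
library(mgcv)

# Simulated stand-in data (hypothetical, not the real datasets)
set.seed(1)
n   <- 50
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

# Default smoothness selection
fit_default <- gam(y ~ s(x), data = dat)

# Same model, with the effective degrees of freedom inflated by 1.4
# in the smoothness selection criterion
fit_gamma <- gam(y ~ s(x), data = dat, gamma = 1.4)

# Total effective degrees of freedom of each fit, for comparison
edf_compare <- c(default   = sum(fit_default$edf),
                 gamma_1.4 = sum(fit_gamma$edf))
print(edf_compare)
```

Comparing the two totals shows how much the larger "gamma" actually changes the fit on a given dataset.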

The only way I've found to suppress overfitting is to set the basis dimension "k" to very low values (3 to 5). However, I don't think this is reasonable, because knot selection then becomes an important issue.
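
A minimal sketch of restricting the basis dimension in this way, again assuming simulated data (the data, the seed, and the names dat and fit_k4 are made up for illustration):

```r
library(mgcv)

# Simulated stand-in data (hypothetical, not the real datasets)
set.seed(2)
n   <- 30
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

# Restrict the smooth to basis dimension k = 4; with the intercept this
# caps the total effective degrees of freedom of the model at 4
fit_k4 <- gam(y ~ s(x, k = 4), data = dat)

# Total effective degrees of freedom actually used by the fit
edf_k4 <- sum(fit_k4$edf)
print(edf_k4)
```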

Is there any other way to avoid overfitting when analyzing small datasets?

Thank you for your help in advance,
Ariyo Kanno


