[R] How to avoid overfitting in gam(mgcv)

Simon Wood s.wood at bath.ac.uk
Wed Oct 3 09:47:20 CEST 2007


What sort of model structure are you using? In particular, what is the response
distribution? For Poisson and binomial responses, overfitting can be a sign of
overdispersion, and quasipoisson or quasibinomial may be better. Also, I would
not expect to get useful smoothing parameter estimates from 10 data points!
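
A minimal sketch of the quasi-family suggestion above, under stated
assumptions: the data are simulated (negative binomial counts, so they are
overdispersed relative to Poisson), and the names x, y, fit_pois and
fit_quasi are made up for illustration. It fits the same smooth with
family = poisson and family = quasipoisson and prints the total effective
degrees of freedom and the estimated scale parameter for comparison.

```r
library(mgcv)

set.seed(1)
n  <- 200
x  <- runif(n)
mu <- exp(1 + sin(2 * pi * x))

# Simulated overdispersed counts (assumption): negative binomial with
# variance mu + mu^2 / size, which exceeds the Poisson variance mu
y <- rnbinom(n, mu = mu, size = 2)

# Poisson fit: the scale parameter is treated as known (fixed at 1)
fit_pois <- gam(y ~ s(x), family = poisson)

# Quasi-Poisson fit: the scale (dispersion) parameter is estimated
fit_quasi <- gam(y ~ s(x), family = quasipoisson)

# Total effective degrees of freedom of each fit, and the estimated scale
edf_pois    <- sum(fit_pois$edf)
edf_quasi   <- sum(fit_quasi$edf)
scale_quasi <- fit_quasi$sig2

print(c(edf_pois = edf_pois, edf_quasi = edf_quasi, scale = scale_quasi))
```

An estimated scale well above 1 suggests overdispersion; in that case the
quasi-Poisson fit will often, though not always, choose a smoother curve
than the Poisson fit.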

best,
Simon

On Wednesday 03 October 2007 06:55, 神野有生 wrote:
> Dear listers,
>
> I'm using gam (from mgcv) for semi-parametric regression on small and
> noisy datasets (10 to 200 observations), and am facing a problem of
> overfitting.
>
> According to the book (Simon N. Wood, Generalized Additive Models: An
> Introduction with R), overfitting can be avoided by inflating the
> effective degrees of freedom in the GCV score with an increased "gamma"
> value (e.g. 1.4). But in my case, it didn't make a significant change in
> the results.
>
> The only way I've found to suppress overfitting is to set the basis
> dimension "k" to very low values (3 to 5). However, I don't think this
> is reasonable, because knot selection will then become an important
> issue.
>
> Is there any other means to avoid overfitting when analyzing small
> datasets?
>
> Thank you for your help in advance,
> Ariyo Kanno
>
> --
> Ariyo Kanno
> 1st-year doctoral student at
> Institute of Environmental Studies,
> The University of Tokyo
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented, minimal,
> self-contained, reproducible code.

-- 
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283 


