[R] How to compare GLM and GAM models

Wed Jun 20 12:17:27 CEST 2007

On Wednesday 20 June 2007 10:34, Prof Brian Ripley wrote:
> On Tue, 19 Jun 2007, Ben Bolker wrote:
> > Yuanchang xie <xieyc <at> hotmail.com> writes:
> >> Dear Listers,
> >>
> >> I want to compare two negative binomial models fitted using glm.nb and
> >> gam(mgcv) based on the same data. What would be the most appropriate
> >> criteria to compare these two models? Can someone point me to some
> >> references? Thank you very much.
> >>
> >> Yuanchang Xie
> >
> >  Since they can't possibly be nested I would suggest AIC.
>
> Surely they could be: a smooth fit in gam includes the possibility of a
> linear fit.
>
> What is of more concern to me is that gam() is by default itself doing
> model selection, so AIC is not well-defined.  According to ?gam.selection,
> the comparisons are best done by comparing scores within mgcv.

- In the negative binomial case I'd also be a bit cautious about AIC: for
the `gam' model the negative binomial `theta' parameter is not an MLE (or
even a penalized MLE); see ?gam.neg.binom for details. That said, comparison
of GCV scores is definitely not an option: the `theta' estimation method
renders it meaningless here.

- Of course if `theta' is known then everything is different. In that case the
negative binomial gam is the same as any other gam with a known scale
parameter, so the default `mgcv::gam' behaviour will be to do smoothness
selection using what is actually an approximate AIC. Estimated degrees of
freedom (EDF) replace the number of parameters in the AIC `penalty' term,
something which can be justified using a variant of the arguments
underpinning the GACV methods proposed by Xiang & Wahba (1996, Stat. Sin.)
and Gu & Xiang (2001, JCGS). In other (non negative binomial) cases, when
the scale parameter is unknown, a variant of GCV is used for smoothness
selection. However, asymptotically this is equivalent to using the AIC type
approach (unsurprisingly; see Stone, 1977, JRSSB).
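
As a rough sketch of the EDF-based AIC described above: the symbols below
(\ell for the log-likelihood at the fitted coefficients, \tau for the
effective degrees of freedom, \mathbf{A} for the influence matrix of the
penalized fit) are notation introduced here for illustration, not taken from
the post, and \tau = tr(A) is only one common definition of EDF.

```latex
\mathrm{AIC} = -2\,\ell(\hat\beta) + 2\,\tau,
\qquad \tau = \mathrm{tr}(\mathbf{A})
```

For an unpenalized GLM with p coefficients, \tau = p and this reduces to the
usual AIC, which is what puts a GLM and a GAM on the same scale.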

- The upshot of this is that generally I think that AIC (modified to use EDF
in place of parameter count, as in R) is a reasonable way to compare GAMs:
in the known scale parameter case it's equivalent to comparing the scores
used for smoothness selection, while in the unknown scale parameter case it's
equivalent to comparing such scores in the large sample limit.

- For negative binomial GAMs with unknown theta, I'd still be inclined to use
`AIC()' as a guide for model selection, but bearing in mind that in that case
it's an approximation without good supporting theory.
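
A minimal sketch of the kind of comparison discussed above, on simulated
data. Everything here is illustrative: the data, the object names (`dat',
`fit_glm', `fit_gam', `aic_tab') and the true theta of 2 are assumptions made
for the example. It also assumes a recent mgcv in which the `nb()' family is
available; that is not necessarily the interface described in
?gam.neg.binom at the time of this post, so the caveats above about theta
should be read against whichever version is in use.

```r
library(MASS)   # glm.nb()
library(mgcv)   # gam(), s(), nb()

set.seed(1)
n   <- 300
x   <- runif(n)
mu  <- exp(1 + sin(2 * pi * x))        # simulated smooth, non-linear mean
y   <- rnbinom(n, size = 2, mu = mu)   # negative binomial counts, theta = 2
dat <- data.frame(x = x, y = y)

# Negative binomial GLM with a linear effect of x; theta estimated by glm.nb
fit_glm <- glm.nb(y ~ x, data = dat)

# Negative binomial GAM with a smooth effect of x; theta estimated by mgcv
fit_gam <- gam(y ~ s(x), family = nb(), data = dat, method = "REML")

# AIC for both fits; the df column for the GAM is based on estimated
# degrees of freedom rather than a raw parameter count
aic_tab <- AIC(fit_glm, fit_gam)
print(aic_tab)

# Total estimated degrees of freedom of the GAM mean model
print(sum(fit_gam$edf))
```

With theta unknown, the resulting AIC difference is best treated as a guide
rather than as a formal test, for the reasons given above.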

best,
Simon


-- 
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283