[R] Gam() function in R

Simon Wood simon at stats.gla.ac.uk
Mon Dec 6 12:41:07 CET 2004


> this subject is very interesting for me. I'm using mgcv 0.8-9 with R
> version 1.7.1. I didn't know that there was another gam version in
> package library(gam). Can someone tell me the basic differences between
> them? I looked for a help page on google but I only find "mgcv" help
> pages.

- I think you'd need to move to a newer version of R in order to use 
package gam, but that would also let you use a much more recent version of 
package mgcv. 

- package gam is based very closely on the GAM approach presented in 
Hastie and Tibshirani's "Generalized Additive Models" book. Estimation is 
by back-fitting, and model selection uses step-wise regression methods 
that rely on approximate distributional results. A particular strength 
of this approach is that local regression smoothers (`lo()' terms) can be 
included in GAMs.

- gam in package mgcv represents GAMs using penalized regression splines. 
Estimation is by direct penalized likelihood maximization with 
integrated smoothness estimation via GCV or related criteria (there is 
also an alternative `gamm' function based on a mixed model approach). 
Strengths of this approach are that s() terms can be functions of more 
than one variable and that tensor product smooths are available via te() 
terms - these are useful when different degrees of smoothness are 
appropriate with respect to different arguments of a smooth. 
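
As a rough illustration of the s() versus te() distinction, here is a 
minimal sketch assuming a reasonably recent mgcv is installed. The 
simulated data and the names dat, b_iso and b_te are made up for the 
example; the two covariates are deliberately put on very different scales.

```r
library(mgcv)  # assumes a reasonably recent version of mgcv

set.seed(1)
n <- 200
# Hypothetical data: x lies in [0, 1], z lies in [0, 100]
dat <- data.frame(x = runif(n), z = runif(n, 0, 100))
dat$y <- sin(2 * pi * dat$x) + 0.0002 * (dat$z - 50)^2 + rnorm(n, sd = 0.3)

# Isotropic smooth of two variables: a single smoothing parameter
b_iso <- gam(y ~ s(x, z), data = dat)

# Tensor product smooth: one smoothing parameter per margin, so the
# degree of smoothness can differ between the x and z directions
b_te <- gam(y ~ te(x, z), data = dat)

length(b_iso$sp)  # number of smoothing parameters for s(x, z)
length(b_te$sp)   # number of smoothing parameters for te(x, z)
```

When the covariates are measured in different units, as here, the te() 
form is usually the more natural choice, since it does not depend on 
their relative scaling.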

Here's an attempt at a summary of the differences:

Estimation: gam::gam based on backfitting, mgcv::gam based on direct 
penalized likelihood maximization (with smoothness estimation integrated)

Model selection: gam::gam based on stepwise regression methods. 
mgcv::gam based on integrated GCV estimation of degree of smoothness.
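
To see what integrated smoothness estimation looks like in practice, the 
following sketch fits a single smooth with mgcv and inspects the selected 
smoothing parameter and effective degrees of freedom. It assumes a recent 
mgcv (where method = "GCV.Cp" requests GCV-based selection); the data are 
simulated and the object names are illustrative only.

```r
library(mgcv)  # assumes a recent mgcv, where `method` accepts "GCV.Cp"

set.seed(2)
n <- 200
dat <- data.frame(x = runif(n))            # hypothetical covariate
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)

# The degree of smoothness is estimated as part of fitting; no separate
# stepwise search over candidate degrees of freedom is needed
b <- gam(y ~ s(x), data = dat, method = "GCV.Cp")

b$sp          # estimated smoothing parameter for s(x)
sum(b$edf)    # total effective degrees of freedom (including intercept)
b$gcv.ubre    # minimized GCV score
```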

Smooth terms: gam::gam can represent smooth terms using a very wide range 
of scatterplot smoothers including loess, which is built in. mgcv::gam is 
restricted to smoothers that can be represented using basis functions and 
an associated ``wiggliness'' penalty, but these include low rank thin 
plate spline smoothers and tensor product smoothers for smooths of more 
than one variable. Both packages provide interfaces for adding new classes 
of smoother. 
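
For comparison, a minimal sketch of a fit with package gam that mixes a 
local regression term with a smoothing spline term. It assumes a version 
of R recent enough to install package gam; the data are simulated, and 
the span and df values are arbitrary choices for illustration.

```r
library(gam)  # Trevor Hastie's package; if mgcv is attached in the same
              # session, s() and gam() will mask one another

set.seed(3)
n <- 200
dat <- data.frame(x = runif(n), z = runif(n))   # hypothetical covariates
dat$y <- sin(2 * pi * dat$x) + dat$z^2 + rnorm(n, sd = 0.3)

# lo() is a loess (local regression) term; s() here is package gam's
# smoothing spline term with a user-chosen df, fitted by backfitting
fit <- gam::gam(y ~ lo(x, span = 0.5) + s(z, df = 4), data = dat)

summary(fit)
```

Note that here the amount of smoothing (span, df) is supplied by the 
user rather than estimated during fitting.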

Uncertainty estimation: since mgcv GAMs explicitly estimate 
coefficients for each smooth term, it is fairly straightforward to obtain 
a covariance matrix for the model coefficients, which makes further 
variance calculations easy. For example, standard errors are easily 
obtained for predictions made with new data. The backfitting approach 
makes variance calculation more difficult (e.g. at present s.e.s are not 
available from gam::predict.gam with new data).
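
The following sketch shows how the coefficient covariance matrix leads to 
standard errors for predictions at new covariate values in mgcv. It 
assumes a recent mgcv; the data, the grid newd and the object names are 
made up for the example.

```r
library(mgcv)  # assumes a recent version of mgcv

set.seed(4)
n <- 200
dat <- data.frame(x = runif(n))            # hypothetical covariate
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.3)
b <- gam(y ~ s(x), data = dat)

# Covariance matrix of the model coefficients
Vb <- vcov(b)

# Predictions with standard errors at new covariate values
newd <- data.frame(x = seq(0, 1, length.out = 5))
p <- predict(b, newdata = newd, se.fit = TRUE)

# The same standard errors computed by hand: Xp maps coefficients to
# predictions, so the prediction variances are diag(Xp %*% Vb %*% t(Xp))
Xp <- predict(b, newdata = newd, type = "lpmatrix")
se_manual <- sqrt(rowSums((Xp %*% Vb) * Xp))

all.equal(as.numeric(p$se.fit), as.numeric(se_manual))
```

The same lpmatrix route can be used for variances of other linear 
functions of the coefficients, such as differences between predictions.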

Interface: both packages are based on Trevor Hastie's Chapter 7 of 
Chambers and Hastie ("Statistical Models in S"). Since Trevor H. wrote 
package gam, it follows that interface more closely than package mgcv 
does. 

Basically, if you want integrated smoothness selection, an underlying 
parametric representation, or want smooth interactions in your models 
then mgcv is probably worth a try (but I would say that). If you want to 
use local regression smoothers and/or prefer the stepwise selection 
approach then package gam is for you. 

Simon

_____________________________________________________________________
> Simon Wood simon at stats.gla.ac.uk        www.stats.gla.ac.uk/~simon/
>>  Department of Statistics, University of Glasgow, Glasgow, G12 8QQ
>>>   Direct telephone: (0)141 330 4530          Fax: (0)141 330 4814



