[R] gam and concurvity

Thomas Lumley tlumley at u.washington.edu
Tue Sep 16 16:38:19 CEST 2003


On Tue, 16 Sep 2003, Martin Wegmann wrote:

> Hello,
>
> in the paper "Avoiding the effects of concurvity in GAM's .." by Figueiras et
> al. (2003) it is mentioned that in a GLM collinearity is taken into account in
> the calculation of the standard errors, but not in a GAM (which results in
> confidence intervals that are too narrow and p-values that are understated;
> this refers to the S-Plus version of GAM). I haven't found any references to
> GAM and concurvity or collinearity on the R page, and I wonder whether the R
> version of gam differs on this point.

They do.

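R gam() uses penalised splines, resulting in an easily managed design
matrix, so the standard errors can be computed from that matrix much as in
a GLM and do reflect collinearity between terms.  S-PLUS gam() uses
smoothing splines, and (until recently) there wasn't any known feasible
formula for the standard errors.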
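
To make the design-matrix point concrete, here is a rough sketch in base R.
It is not the code that gam() actually runs: the truncated-line basis, the
knot positions, lambda = 1, and the helper name pen_fit_se are all
assumptions chosen for illustration. It fits a penalised spline in x1 plus a
linear term in x2 by penalised least squares and reads the standard error of
the x2 coefficient off the resulting covariance matrix, once when x2 is
nearly a function of x1 and once when it is unrelated to x1.

```r
# Sketch only: a penalised spline fitted "by hand" from an explicit
# design matrix. Basis, knots, and lambda are illustrative assumptions.
pen_fit_se <- function(x1, x2, y,
                       knots = seq(0.1, 0.9, by = 0.1), lambda = 1) {
  # Truncated-line basis for the smooth of x1
  Z <- outer(x1, knots, function(x, k) pmax(x - k, 0))
  # Design matrix: intercept, linear x1, spline terms, linear x2
  X <- cbind(1, x1, Z, x2)
  # Ridge penalty on the spline coefficients only
  S <- diag(c(0, 0, rep(1, length(knots)), 0))

  XtX  <- crossprod(X)
  A    <- solve(XtX + lambda * S)          # (X'X + lambda S)^{-1}
  beta <- A %*% crossprod(X, y)            # penalised estimate
  edf  <- sum(diag(A %*% XtX))             # effective degrees of freedom
  sigma2 <- sum((y - X %*% beta)^2) / (length(y) - edf)

  # Covariance of the penalised estimate; off-diagonal structure of X'X
  # (i.e. collinearity) enters through A
  V <- sigma2 * A %*% XtX %*% A
  sqrt(diag(V))[ncol(X)]                   # SE of the x2 coefficient
}

set.seed(1)
n   <- 200
x1  <- runif(n)
eps <- rnorm(n, sd = 0.3)

x2_conc <- x1 + rnorm(n, sd = 0.05)        # nearly a function of x1
x2_ind  <- runif(n)                        # unrelated to x1

se_conc <- pen_fit_se(x1, x2_conc, sin(2 * pi * x1) + 0.5 * x2_conc + eps)
se_ind  <- pen_fit_se(x1, x2_ind,  sin(2 * pi * x1) + 0.5 * x2_ind  + eps)

print(c(concurve = se_conc, independent = se_ind))
```

The standard error for the nearly redundant x2 should come out clearly
larger than the one for the independent x2, which is the sense in which the
collinearity is taken into account.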

However:

1/ `Concurvity' is a serious problem only for a few extreme uses of gam.
Even in the air pollution time series studies that provoked the recent
fuss, its impact is really important only in studies that very
aggressively removed seasonal patterns or in data with huge seasonal
variations (e.g. inland Canada).

2/  These two cases are precisely the cases where the results are
sensitive to the choice of time scale at which seasonal variation
confounds the association, a choice that is not identifiable from the
data.

3/  Neither S-PLUS nor R gam() standard errors incorporate the uncertainty
in an automatically chosen smoothing parameter.

4/ Trevor Hastie and colleagues have written software for calculating
correct standard errors for S-PLUS gam.
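
For anyone who wants to check whether their own model is in the problematic
range, a minimal sketch follows. It assumes a current version of the mgcv
package, whose concurvity() function reports how well each smooth term can
be reproduced from the other terms in the model. The simulated data and
variable names are made up for illustration.

```r
library(mgcv)

# Simulated data in which x2 is almost a copy of x1 (illustrative only)
set.seed(2)
n  <- 300
x1 <- runif(n)
x2 <- x1 + rnorm(n, sd = 0.02)
y  <- sin(2 * pi * x1) + x2 + rnorm(n, sd = 0.3)

fit <- gam(y ~ s(x1) + s(x2))

# Concurvity of each term with the rest of the model:
# values near 0 indicate no problem, values near 1 indicate that the
# term is almost entirely reproducible from the other terms.
cc <- concurvity(fit, full = TRUE)
print(round(cc, 2))
```

With data constructed like this the values for s(x1) and s(x2) should be
close to 1; in a typical analysis they would be expected to be much lower.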

	-thomas

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
