[R] [Rd] Formulas in gam function of mgcv package

Wed Aug 26 16:09:21 CEST 2009

On Tue, 2009-08-25 at 10:00 +0100, Corrado wrote:
> Dear Gavin / Rlings,
> 
> thanks for your kind answer and sorry for posting to the dev mailing list.
> 
> Concerning the specifics of your answer:
> 
> I am working with 6 to 36 covariates, and they are all centred and scaled. I 
> represented the problem with two variables to simplify the question.
> 
> So ideally, the situation is:
> 
> 1) y ~ s(x1) + .... + s(x36)
> 
> vs.
> 
> 2) y~s(x1, .... ,x36)

I think you are pushing things a bit with such a complicated smooth.
You're unlikely to be able to fit a smooth of 36 variables, because of
insufficient data, hardware limits on your machine, or both.
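
A rough sketch of why the basis alone becomes a problem, assuming the
default thin plate regression spline basis and assuming the penalty
order m is taken as the smallest integer with 2m > d + 1 for d
covariates. The helper name tprs_null_dim is made up for this
illustration. The unpenalised part of such a smooth then contains
choose(m + d - 1, d) functions, and k cannot be smaller than that:

```r
## Hypothetical helper: dimension of the penalty null space of a thin
## plate spline in d covariates, with m chosen as the smallest integer
## satisfying 2m > d + 1 (an assumption about the default penalty order).
tprs_null_dim <- function(d) {
  m <- floor((d + 1) / 2) + 1
  choose(m + d - 1, d)
}

## Compare a few dimensions, including the 36-covariate case
dims <- c(1, 2, 4, 36)
data.frame(d = dims, null_dim = sapply(dims, tprs_null_dim))
```

Under these assumptions the count for d = 36 is far larger than any
realistic number of observations.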

I see that Simon has responded to this as well, in a far more
comprehensive and informed manner than I could manage.... So I'll leave
it at that...

> 
> I am trying to build a predictive model. Since the variables are centred 
> and scaled, I think I need an isotropic smooth. I am also interested in having 
> the interactions between the variables included, that is not a purely additive 
> model.

That sounds a bit like data fishing; throw everything into the pot and
see what comes out of it.

<snip />
> I have also some difficulties in understanding the values you have chosen for k 
> in the first example (why 60?).

Sorry, that was a complication on my part. The main point was to show
that the s(x1) and s(x2) parts of the formula need to use the same
bases in both models; so if you had this model

y ~ s(x1, k = 20) + s(x2, k = 20)

You need something like 

y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2)

[If you wanted the bivariate smooth to be more complicated than the
default in mgcv, then you might have done:

y ~ s(x1, x2, k = 60) ## for example

in which case you could fit that model as 

y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)

] That was where the k = 60 came from, but in simplifying my response I
forgot to remove it.
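
A minimal sketch of that comparison on simulated data, assuming the
mgcv package is installed. The data, the sample size and the object
names (dat, m_add, m_int) are invented for illustration; only the two
model formulas come from the discussion above.

```r
library(mgcv)

set.seed(1)
n <- 500
dat <- data.frame(x1 = runif(n), x2 = runif(n))
## Simulated response with an x1:x2 interaction (made up for this example)
dat$y <- sin(2 * pi * dat$x1) + dat$x2^2 + 2 * dat$x1 * dat$x2 +
  rnorm(n, sd = 0.3)

## Additive model with explicit basis dimensions
m_add <- gam(y ~ s(x1, k = 20) + s(x2, k = 20), data = dat)

## Same univariate terms plus the isotropic bivariate smooth
m_int <- gam(y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60),
             data = dat)

## Informal comparison of the two fits
AIC(m_add, m_int)
```

Because the univariate terms are written identically in both formulas,
the second model only adds the bivariate term to the first. As far as I
understand it, gam() applies extra identifiability constraints to
s(x1, x2) when it shares covariates with the univariate smooths, so the
larger fit can be noticeably slower.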

Simon has since provided a more thorough response (Thanks Simon).

HTH

G

> 
> Thanks
> 
> Best,
> 
> 
> 
> On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
> > [Note R-Devel is the wrong list for such questions. R-Help is where this
> > should have been directed - redirected there now]
> >
> > On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
> > > Dear R-experts,
> > >
> > > I have a question on the formulas used in the gam function of the mgcv
> > > package.
> > >
> > > I am trying to understand the relationships between:
> > >
> > > y~s(x1)+s(x2)+s(x3)+s(x4)
> > >
> > > and
> > >
> > > y~s(x1,x2,x3,x4)
> > >
> > > Does the latter contain the former? What about the smoothers of all
> > > interaction terms?
> >
> > I'm not 100% certain how this scales to smooths of more than 2
> > variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book
> > Generalized Additive Models: An Introduction with R (2006, Chapman &
> > Hall/CRC) discuss this for smooths of 2 variables.
> >
> > Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
> > used to produce the smoothers in the two models may not be the same in
> > both models. One option to ensure nestedness is to fit the more
> > complicated model as something like this:
> >
> > ## if simpler model were: y ~ s(x1, k=20) + s(x2, k = 20)
> > y ~ s(x1, k=20) + s(x2, k = 20) + s(x1, x2, k = 60)
> >                                   ^^^^^^^^^^^^^^^^^
> > where the last term (^^^ above) has the same k as used in the model
> > y ~ s(x1, x2, k = 60)
> >
> > Note that these are isotropic smooths; are x1 and x2 measured in the
> > same units etc.? Tensor product smooths may be more appropriate if not,
> > and if we specify the bases when fitting the models, s(x1) + s(x2) *is*
> > strictly nested in te(x1, x2), e.g.
> >
> > y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)
> >
> > is strictly nested within
> >
> > y ~ te(x1, x2, k = 10)
> > ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)
> >
> > [Note that bs = "cr" is the default basis in te() smooths, hence we
> > don't need to specify it, and k = 10 refers to each individual smooth in
> > the te().]
> >
> > HTH
> >
> > G
> >
> > > I have (tried to) read the manual pages of gam, formula.gam,
> > > smooth.terms, linear.functional.terms but could not understand properly.
> > >
> > > Regards
> 
> 
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%