[R] physical constraint with gam
Dominik.Schneider at colorado.edu
Wed May 11 18:11:17 CEST 2016
Hi Simon, Thanks for this explanation.
To make sure I understand, another way of explaining the y axis in my
original example is that it is the contribution to snowdepth relative to
the other variables (the example only had fsca, but my actual case has a
couple others). i.e. a negative s(fsca) of -0.5 simply means snowdepth 0.5
units below the intercept+s(x_i), where s(x_i) could also be negative in
the case where total snowdepth is less than the intercept value.
The use of by=fsca is really useful for interpreting the marginal impact of
the different variables. With my actual data, the term s(fsca):fsca is
never negative, which is much more intuitive. Is it appropriate to compare
magnitudes of e.g. s(x2):x2 / mean(x2) and s(x2):x2 / mean(x2) where
mean(x_i) are the mean of the actual data?
Lastly, how would these two differ: s(x1,by=x2); or s(x1,by=x1)*s(x2,by=x2)
since interactions are surely present and i'm not sure if a linear
combination is enough.
On Wed, May 11, 2016 at 3:11 AM, Simon Wood <simon.wood at bath.edu> wrote:
> The spline having a positive value is not the same as a glm coefficient
> having a positive value. When you plot a smooth, say s(x), that is
> equivalent to plotting the line 'beta * x' in a GLM. It is not equivalent
> to plotting 'beta'. The smooths in a gam are (usually) subject to
> `sum-to-zero' identifiability constraints to avoid confounding via the
> intercept, so they are bound to be negative over some part of the covariate
> range. For example, if I have a model y ~ s(x) + s(z), I can't estimate the
> mean level for s(x) and the mean level for s(z) as they are completely
> confounded, and confounded with the model intercept term.
> I suppose that if you want to interpret the smooths as glm parameters
> varying with the covariate they relate to then you can do, by setting the
> model up as a varying coefficient model, using the `by' argument to 's'...
> this model is `snowdepth_i = f(fsca_i) * fsca_i + e_i' . s(fsca,by=fsca)
> is not confounded with the intercept, so no constraint is needed or
> applied, and you can now interpret the smooth like a local GLM coefficient.
> On 11/05/16 01:30, Dominik Schneider wrote:
>> Just getting into using GAM using the mgcv package. I've generated some
>> models and extracted the splines for each of the variables and started
>> visualizing them. I'm noticing that one of my variables is physically
>> In the example below, my interpretation of the following plot is that the
>> y-axis is basically the equivalent of a "parameter" value of a GLM; in GAM
>> this value can change as the functional relationship changes between x and
>> y. In my case, I am predicting snowdepth based on the fractional snow
>> covered area. In no case will snowdepth realistically decrease for a unit
>> increase in fsca so my question is: *Is there a way to constrain the
>> to positive values? *
>> datplot=cbind(sterms,mdl$model) %>% tbl_df
>> dat=data_frame(snowdepth=runif(100,min =
>> [[alternative HTML version deleted]]
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> PLEASE do read the posting guide
>> and provide commented, minimal, self-contained, reproducible code.
> Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
> +44 (0)117 33 18273 http://www.maths.bris.ac.uk/~sw15190
[[alternative HTML version deleted]]
More information about the R-help