[R] physical constraint with gam
Dominik Schneider
dosc3612 at colorado.edu
Mon May 16 22:23:01 CEST 2016
Thanks for the clarification!
On Sat, May 14, 2016 at 1:24 AM, Simon Wood <simon.wood at bath.edu> wrote:
> On 12/05/16 02:29, Dominik Schneider wrote:
>
> Hi again,
> I'm looking for some clarification on 2 things.
> 1. On that last note, I realize that s(x1,x2) would be the other obvious
> interaction to compare with - and I see that you recommend te(x1,x2) if
> they are not on the same scale.
>
> - yes that's right, s(x1,x2) gives an isotropic smooth, which is usually
> only appropriate if x1 and x2 are naturally on the same scale.
>
> 2. If s(x1,by=x1) gives you a "parameter" value similar to a GLM when you
> plot s(x1):x1, why does my function above return the same yhat as
> predict(mdl,type='response') ? Shouldn't each of the terms need to be
> multiplied by the variable value before applying
> rowSums()+attr(sterms,'constant') ??
>
> predict returns s(x1)*x1 (plot.gam just plots s(x1), because in general
> s(x1,by=x2) is not smooth). If you want to get s(x1) on its own you need to
> do something like this:
>
> x2 <- x1 ## copy x1
> m <- gam(y~s(x1,by=x2)) ## model implementing s(x1,by=x1) using copy of x1
> predict(m,data.frame(x1=x1,x2=rep(1,length(x2))),type="terms") ## now
> predicted s(x1)*x2 = s(x1)
>
> best,
> Simon
>
>
> Thanks again
> Dominik
>
> On Wed, May 11, 2016 at 10:11 AM, Dominik Schneider <
> <Dominik.Schneider at colorado.edu>Dominik.Schneider at colorado.edu> wrote:
>
>> Hi Simon, Thanks for this explanation.
>> To make sure I understand, another way of explaining the y axis in my
>> original example is that it is the contribution to snowdepth relative to
>> the other variables (the example only had fsca, but my actual case has a
>> couple others). i.e. a negative s(fsca) of -0.5 simply means snowdepth 0.5
>> units below the intercept+s(x_i), where s(x_i) could also be negative in
>> the case where total snowdepth is less than the intercept value.
>>
>> The use of by=fsca is really useful for interpreting the marginal impact
>> of the different variables. With my actual data, the term s(fsca):fsca is
>> never negative, which is much more intuitive. Is it appropriate to compare
>> magnitudes of e.g. s(x2):x2 / mean(x2) and s(x2):x2 / mean(x2) where
>> mean(x_i) are the mean of the actual data?
>>
>> Lastly, how would these two differ: s(x1,by=x2); or
>> s(x1,by=x1)*s(x2,by=x2) since interactions are surely present and i'm not
>> sure if a linear combination is enough.
>>
>> Thanks!
>> Dominik
>>
>>
>> On Wed, May 11, 2016 at 3:11 AM, Simon Wood < <simon.wood at bath.edu>
>> simon.wood at bath.edu> wrote:
>>
>>> The spline having a positive value is not the same as a glm coefficient
>>> having a positive value. When you plot a smooth, say s(x), that is
>>> equivalent to plotting the line 'beta * x' in a GLM. It is not equivalent
>>> to plotting 'beta'. The smooths in a gam are (usually) subject to
>>> `sum-to-zero' identifiability constraints to avoid confounding via the
>>> intercept, so they are bound to be negative over some part of the covariate
>>> range. For example, if I have a model y ~ s(x) + s(z), I can't estimate the
>>> mean level for s(x) and the mean level for s(z) as they are completely
>>> confounded, and confounded with the model intercept term.
>>>
>>> I suppose that if you want to interpret the smooths as glm parameters
>>> varying with the covariate they relate to then you can do, by setting the
>>> model up as a varying coefficient model, using the `by' argument to 's'...
>>>
>>> gam(snowdepth~s(fsca,by=fsca),data=dat)
>>>
>>>
>>> this model is `snowdepth_i = f(fsca_i) * fsca_i + e_i' . s(fsca,by=fsca)
>>> is not confounded with the intercept, so no constraint is needed or
>>> applied, and you can now interpret the smooth like a local GLM coefficient.
>>>
>>> best,
>>> Simon
>>>
>>>
>>>
>>>
>>> On 11/05/16 01:30, Dominik Schneider wrote:
>>>
>>>> Hi,
>>>> Just getting into using GAM using the mgcv package. I've generated some
>>>> models and extracted the splines for each of the variables and started
>>>> visualizing them. I'm noticing that one of my variables is physically
>>>> unrealistic.
>>>>
>>>> In the example below, my interpretation of the following plot is that
>>>> the
>>>> y-axis is basically the equivalent of a "parameter" value of a GLM; in
>>>> GAM
>>>> this value can change as the functional relationship changes between x
>>>> and
>>>> y. In my case, I am predicting snowdepth based on the fractional snow
>>>> covered area. In no case will snowdepth realistically decrease for a
>>>> unit
>>>> increase in fsca so my question is: *Is there a way to constrain the
>>>> spline
>>>> to positive values? *
>>>>
>>>> Thanks
>>>> Dominik
>>>>
>>>> library(mgcv)
>>>> library(dplyr)
>>>> library(ggplot2)
>>>> extract_splines=function(mdl){
>>>> sterms=predict(mdl,type='terms')
>>>> datplot=cbind(sterms,mdl$model) %>% tbl_df
>>>> datplot$intercept=attr(sterms,'constant')
>>>> datplot$yhat=rowSums(sterms)+attr(sterms,'constant')
>>>> return(datplot)
>>>> }
>>>> dat=data_frame(snowdepth=runif(100,min =
>>>> 0.001,max=6.7),fsca=runif(100,0.01,.99))
>>>> mdl=gam(snowdepth~s(fsca),data=dat)
>>>> termdF=extract_splines(mdl)
>>>> ggplot(termdF)+
>>>> geom_line(aes(x=fsca,y=`s(fsca)`))
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> <http://www.R-project.org/posting-guide.html>
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>> --
>>> Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
>>> +44 (0)117 33 18273 <%2B44%20%280%29117%2033%2018273>
>>> <http://www.maths.bris.ac.uk/%7Esw15190>
>>> http://www.maths.bris.ac.uk/~sw15190
>>>
>>>
>>
>
>
> --
> Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
> +44 (0)117 33 18273 http://www.maths.bris.ac.uk/~sw15190
>
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list