[R] fitting gam with constraints on parameters in mgcv

Chris Wilcox cwilcox at minderoo.org
Sat Oct 22 02:41:12 CEST 2022


Hi all,

I am trying to fit a GAM using mgcv that has a mix of smooth and parametric terms.  The model is for count data on fish catches.  I am modelling variation in location and time, but also differences among individual operators (vessels).  I am specifically interested in the differences among operators, so I am treating them as a fixed effect in my model.  While there are solutions for the problem outlined below in a GLM context, I need to work with a GAM because a two-dimensional surface is a key feature of the model.

My challenge is that some levels of the vessel fixed effect have responses that are always 0, so their coefficient estimates move toward negative infinity under the log link.  This leads to convergence failures.

I am looking for ways to solve this issue.  One obvious solution is to move to a Bayesian approach and use weakly informative priors on the fixed-effect terms.  I have implemented this in brms, using the same model structure.  This works, but the complexity of the models means that fitting takes much longer, and I have a number of data sets to analyze.

I have found a few related questions:
https://stats.stackexchange.com/questions/504542/dealing-with-quasi-complete-separation-in-general-additive-model?r=SearchResults&s=1%7C34.9004
https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression
but the first, which dealt with GAMs, wasn't resolved, and the second didn't appear to offer any viable solutions that I haven't already tried.

I would like to find a way to address this using GAMs in mgcv.  There seem to be two possibilities: using a penalty of some sort on the fixed-effect term, or imposing a constraint on it.  After a lot of reading, I have not been able to find a good example of a penalty approach, and I am struggling to operationalize the pcls example from the mgcv help.
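
For concreteness, the penalty idea might look something like the sketch below.  It is only an illustration under stated assumptions, not a tested solution: it swaps the unpenalized Vessel factor for a ridge-penalized random-effect smooth, s(Vessel, bs = "re"), which means vessels are no longer strictly fixed effects.  The seed, the REML setting and the names M_pen and vessel_coefs are assumptions added for the sketch; the simulated data mirror the example data set at the end of this message.

```r
library(mgcv)

# Simulated data mirroring the example at the end of this message
# (the seed is an assumption, added only for reproducibility)
set.seed(1)
Lat <- runif(100, 1, 20)
Lon <- runif(100, 1, 20)
Year <- runif(100, 1, 10)
Vessel <- rep((1:5) / 10, each = 20)
Lambda <- exp(0.1 * Lat - 0.01 * Lat^2 + 0.1 * Lon + 0.01 * Lon^2 +
              0.05 * Year + Vessel)
dt <- data.frame(Lat, Lon, Year, Vessel = as.factor(Vessel),
                 Catch = rpois(100, Lambda))

# Create separation: all observations zero for one vessel
dt$Catch[dt$Vessel == levels(dt$Vessel)[2]] <- 0

# Vessel enters through a ridge penalty (bs = "re") instead of as an
# unpenalized factor; the penalty strength is estimated by REML
M_pen <- gam(Catch ~ s(Lat, Lon) + s(Year) + s(Vessel, bs = "re"),
             family = tw(), data = dt, method = "REML")

# One shrunken coefficient per vessel level
vessel_coefs <- coef(M_pen)[grep("Vessel", names(coef(M_pen)))]
print(vessel_coefs)
```

Whether shrinking all vessel levels toward a common value is acceptable depends on how the vessel differences are to be interpreted; the paraPen argument of gam is another place a penalty on parametric terms could be specified.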

The help for pcls in mgcv is at https://rdrr.io/cran/mgcv/man/pcls.html, and the first example is the closest to what I am trying to do.

I have included an example data set to illustrate the problem.  Any suggestions are greatly appreciated.

library(mgcv)

# set up data
set.seed(1)  # arbitrary seed, added so the example is reproducible
Lat <- runif(100, 1, 20)
Lon <- runif(100, 1, 20)
Year <- runif(100, 1, 10)
Vessel <- rep((1:5)/10, each = 20)
Lambda <- exp(0.1*Lat - 0.01*Lat^2 + 0.1*Lon + 0.01*Lon^2 + 0.05*Year + Vessel)
Vessel <- as.factor(Vessel)
Catch <- rpois(100, Lambda)  # one Poisson draw per observation
dt <- data.frame(Lat, Lon, Year, Vessel, Catch)

# now fit gam
M <- gam(Catch ~ s(Lat, Lon) + s(Year) + Vessel, family = tw, data = dt)

# create separation: all observations zero for one vessel
dt$Catch[dt$Vessel == levels(dt$Vessel)[2]] <- 0

# now fit gam again with separation
M <- gam(Catch ~ s(Lat, Lon) + s(Year) + Vessel, family = tw, data = dt)
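
In the real data the separated levels are not known in advance, so a quick check along the following lines could be used to list the vessels whose catches are all zero.  This is a minimal base-R sketch; the seed and the names all_zero and separated are assumptions introduced for the illustration, and the data are rebuilt so the block runs on its own.

```r
# Rebuild the example data (the seed is an assumption, for reproducibility)
set.seed(1)
Lat <- runif(100, 1, 20)
Lon <- runif(100, 1, 20)
Year <- runif(100, 1, 10)
Vessel <- rep((1:5) / 10, each = 20)
Lambda <- exp(0.1 * Lat - 0.01 * Lat^2 + 0.1 * Lon + 0.01 * Lon^2 +
              0.05 * Year + Vessel)
dt <- data.frame(Lat, Lon, Year, Vessel = as.factor(Vessel),
                 Catch = rpois(100, Lambda))

# Create separation: all observations zero for one vessel
dt$Catch[dt$Vessel == levels(dt$Vessel)[2]] <- 0

# Flag factor levels whose responses are all zero; these are the levels
# whose coefficients drift toward negative infinity under the log link
all_zero <- tapply(dt$Catch, dt$Vessel, function(x) all(x == 0))
separated <- names(all_zero)[all_zero]
print(separated)
```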


Thanks,

Chris



Chris Wilcox
Minderoo Foundation * Flourishing Oceans