[R] mgcv: relative risk from GAM with distributed lag

j@de@shod@@ m@iii@g oii googiem@ii@com j@de@shod@@ m@iii@g oii googiem@ii@com
Tue Jul 19 17:07:42 CEST 2022


Dear list members,

Does anyone know how to obtain a relative risk/ risk ratio from a GAM
with a distributed lag model implemented in mgcv? I have a GAM
predicting daily deaths from time series data consisting of daily
temperature, humidity and rainfall. The GAM includes a distributed lag
model because deaths may occur over several days following a high heat
day.

What I'd like to do is compute (and plot) the relative risk
(accumulated across all lags) for a given temperature vs the
temperature at which the risk is lowest, with corresponding confidence
intervals. I am aware of the predict.gam function but am not sure if
and how it should be used in this case. (Additionally, I'd also like
to plot the relative risk for different lags separately).

I apologise if this seems trivial to some. (Actually, I hope it is,
because that might mean I get a solution!) I've been looking for
examples on how to do this, but found nothing so far. Suggestions
would be very much appreciated!

Below is a reproducible example with the GAM:

library(mgcv)
set.seed(3) # make reproducible example
simdat <- gamSim(1,400) # simulate data
g <- exp(simdat$f/5)
simdat$y <- rnbinom(g,size=3,mu=g)  # negative binomial response var
simdat$time <- 1:400  # create time series
names(simdat) <- c("deaths", "temp", "humidity", "rain", "x3", "f",
"f0", "f1", "f2", "f3", "time")

# lag function based on Simon Wood (book 2017, p.349 and gamair
package documentation p.54
# https://cran.rstudio.com/web/packages/gamair/gamair.pdf)
lagard <- function(x,n.lag=7) {
n <- length(x); X <- matrix(NA,n,n.lag)
for (i in 1:n.lag) X[i:n,i] <- x[i:n-i+1]
X
}

# set up lag, temp, rain and humidity as 7-column matrices
# to create lagged variables - based on Simon Wood's example
dat <- list(lag=matrix(0:6,nrow(simdat),7,byrow=TRUE),
deaths=simdat$deaths, time = simdat$time)
dat$temp <- lagard(simdat$temp)
dat$rain <- lagard(simdat$rain)
dat$humidity <- lagard(simdat$humidity)

mod <- gam(deaths~s(time, k=70) + te(temp, lag, k=c(12, 4)) +
te(humidity, lag, k=c(12, 4)) + te(rain, lag, k=c(12, 4)), data = dat,
family = nb, method = 'REML', select = TRUE)

summary(mod)
plot(mod, scheme = 1)
plot(mod, scheme = 2)

Thanks for any suggestions you may have,

Jade



More information about the R-help mailing list