[R] Assessing temporal correlation in GAM with irregular time steps

Gavin Simpson ucfagls at gmail.com
Tue Sep 3 21:16:36 CEST 2013


It is possible, but you can't use the discrete time or classical
stochastic trend models (or evaluate using the ACF). Also, why do you
care to do this with regard to DoY? The assumption of the model
relates to the residuals, so you should check those for residual
autocorrelation.

As you are using `mgcv::gam` you could also use `mgcv::gamm` which can
then leverage the correlation structures from the nlme package, which
has spatial correlation structures (and you can think of time as a 1-d
spatial direction). The package also has a `corCAR1()` correlation
structure which is the continuous-time analogue of the AR(1). Fitting
via `gamm()` will also allow you to use the `Variogram()` function
from the nlme package to assess the model residuals for residual
autocorrelation.

For example you could compare the two fits

m0 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo,
method = "REML")
m1 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo,
method = "REML",
                    correlation = corCAR1( ~ bar | SiteCode))

where `foo` is the object that contains the variables mentioned in the
call, and `bar` is the variable (in `foo)` that indicates the ordering
of the samples. Notice that I nest the CAR(1) within the two
respective Sites, but do note IIRC that this fits the same residual
correlation structure to both sites' residuals (i.e. there is 1 CAR(1)
process, not two separate ones).

require(nlme)
anova(m0$lme, m1$lme)

will perform a likelihood ratio test on the two models.

If you have residual autocorrelation, do note that the smooth for DoY
may be chosen to be more complex than is appropriate (it might be
fitting the autocorrleated noise), so you may want to fix the degrees
of freedom for the smoother at some a priori chosen value and use this
same value when fitting both m0 and m1, or at the very least set an
upper limit on the complexity of the DoY smooth, say via s(DoY, by =
SiteCode, k = 5).

Finally, as a length <= 0 insect makes no sense, the assumption of
Gaussian (Normal) errors may be in trouble with your data; apart from
their strictly positive nature, the mean-variance relationship of the
data may not follow that of the assumptions for the errors. You can
move to a GLM (GAM) to account for this but things get very tricky
with the correlation structures (you can use gamm() still but fitting
then goes via glmmPQL() in the MASS package a thence to lme()).

If you just want to fit a variogram to something, there are a large
number of spatial packages available for R, several of which can fit
variograms to data, though you will need to study their respective
help files for how to use them. As for the input data, often the
time/date of sampling encoded as a numeric will be sufficient input,
but you will need to check individual functions for what they require.
I would check out the Spatial Task View on CRAN.

HTH

G

On 28 August 2013 14:26, Worthington, Thomas A
<thomas.worthington at okstate.edu> wrote:
> I have constructed a GAM using the package mgcv to test whether the lengths of an emerging insect (Length) varies with day of the year (DOY) and between two sites (SiteCode). The data are collected at irregular time steps ranging from 2 days to 20 days between samples. The GAM takes the form
>
> M3 <- gam(Length ~s(DOY, by = SiteCode) + SiteCode)
>
> As the data are a time series I would like to test for temporal autocorrelation. I have read that it is not possible to use the autocorrelation function (ACF) due to the irregular spacing and that producing a variogram in relation to DOY would be an option.
>
> Is this a correct method to test for temporal autocorrelation?
>
> And could someone suggest the code to produce the variogram as I'm getting an error related to the 'distance' argument
>
> Best wishes
> Tom
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gavin Simpson, PhD



More information about the R-help mailing list