[R] Assessing temporal correlation in GAM with irregular time steps

Worthington, Thomas A thomas.worthington at okstate.edu
Wed Sep 4 00:10:05 CEST 2013


Dear Gavin 

Thank you for the very detailed response. I had started to go down the route of fitting a correlation structure via gamm. 

I tried applying your code to my data but returned the error 
"Error in corCAR1(~ID | SiteCode1971) : parameter in CAR(1) structure must be between 0 and 1" 

I set the 'bar' in your code to the sample ID (basically a number between 1 and 192) but I wasn't sure if this was what you meant in relation to 'ordering of the samples'

Best wishes
Tom   

-----Original Message-----
From: Gavin Simpson [mailto:ucfagls at gmail.com] 
Sent: Tuesday, September 03, 2013 3:17 PM
To: Worthington, Thomas A
Cc: r-help at r-project.org
Subject: Re: [R] Assessing temporal correlation in GAM with irregular time steps

It is possible, but you can't use the discrete time or classical stochastic trend models (or evaluate using the ACF). Also, why do you care to do this with regard to DoY? The assumption of the model relates to the residuals, so you should check those for residual autocorrelation.

As you are using `mgcv::gam` you could also use `mgcv::gamm` which can then leverage the correlation structures from the nlme package, which has spatial correlation structures (and you can think of time as a 1-d spatial direction). The package also has a `corCAR1()` correlation structure which is the continuous-time analogue of the AR(1). Fitting via `gamm()` will also allow you to use the `Variogram()` function from the nlme package to assess the model residuals for residual autocorrelation.

For example you could compare the two fits

m0 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = "REML")
m1 <- gamm(Length ~ s(DOY, by = SiteCode) + SiteCode, data = foo, method = "REML",
                    correlation = corCAR1( ~ bar | SiteCode))

where `foo` is the object that contains the variables mentioned in the call, and `bar` is the variable (in `foo)` that indicates the ordering of the samples. Notice that I nest the CAR(1) within the two respective Sites, but do note IIRC that this fits the same residual correlation structure to both sites' residuals (i.e. there is 1 CAR(1) process, not two separate ones).

require(nlme)
anova(m0$lme, m1$lme)

will perform a likelihood ratio test on the two models.

If you have residual autocorrelation, do note that the smooth for DoY may be chosen to be more complex than is appropriate (it might be fitting the autocorrleated noise), so you may want to fix the degrees of freedom for the smoother at some a priori chosen value and use this same value when fitting both m0 and m1, or at the very least set an upper limit on the complexity of the DoY smooth, say via s(DoY, by = SiteCode, k = 5).

Finally, as a length <= 0 insect makes no sense, the assumption of Gaussian (Normal) errors may be in trouble with your data; apart from their strictly positive nature, the mean-variance relationship of the data may not follow that of the assumptions for the errors. You can move to a GLM (GAM) to account for this but things get very tricky with the correlation structures (you can use gamm() still but fitting then goes via glmmPQL() in the MASS package a thence to lme()).

If you just want to fit a variogram to something, there are a large number of spatial packages available for R, several of which can fit variograms to data, though you will need to study their respective help files for how to use them. As for the input data, often the time/date of sampling encoded as a numeric will be sufficient input, but you will need to check individual functions for what they require.
I would check out the Spatial Task View on CRAN.

HTH

G

On 28 August 2013 14:26, Worthington, Thomas A <thomas.worthington at okstate.edu> wrote:
> I have constructed a GAM using the package mgcv to test whether the 
> lengths of an emerging insect (Length) varies with day of the year 
> (DOY) and between two sites (SiteCode). The data are collected at 
> irregular time steps ranging from 2 days to 20 days between samples. 
> The GAM takes the form
>
> M3 <- gam(Length ~s(DOY, by = SiteCode) + SiteCode)
>
> As the data are a time series I would like to test for temporal autocorrelation. I have read that it is not possible to use the autocorrelation function (ACF) due to the irregular spacing and that producing a variogram in relation to DOY would be an option.
>
> Is this a correct method to test for temporal autocorrelation?
>
> And could someone suggest the code to produce the variogram as I'm 
> getting an error related to the 'distance' argument
>
> Best wishes
> Tom
>
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Gavin Simpson, PhD


More information about the R-help mailing list