[R] lme, lmer, gls, and spatial autocorrelation

Tue Aug 25 18:11:44 CEST 2009

Manuel,

Thanks for the reference. I printed it out and read through it this
morning.

I think I'm going to take a gls approach. I've spent the last couple weeks
reading about spatial autocorrelation, and found that the world of SAC is
large, complex, and requires more time than I currently have. Using gls
seems a reasonable compromise between statistical rigour, and the
unfortunate but real constraint of my limited time to work on this project.
According to Dormann et al., in their (admittedly limited) tests, GLS worked
reasonably well with Poisson distributed synthetic data.

Also, I've come to think that the ability to do model comparison would be
useful. While I would like to be able to confidently choose a model for
spatial autocorrelation a priori, based on biological knowledge, I don't
have enough information to do this. Even after some data exploration, using
variograms and plots of Moran's I, it still seems like there's insufficient
information. Using a fitness score such as AIC, I could compare a small
number of reasonable models to find the most appropriate error structure.
Additionally, I could compare the SAC-informed and SAC-ignorant models to
get a holistic assessment of the importance of SAC in my data.
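
A minimal sketch of that comparison, assuming the nlme package. The data
frame `plots` and its columns (`x`, `y`, `logArea`, `species`) are
hypothetical stand-ins simulated for illustration, not the real plot data,
and the response is treated as Gaussian, as gls requires. Each candidate
error structure is fitted with the same fixed effects, so the REML-based
AIC values should be comparable; a fit that fails to converge is dropped
rather than stopping the run.

```r
library(nlme)  # gls() and the corSpatial correlation classes

set.seed(42)
# Hypothetical stand-in data: 96 small and 24 large plots with random
# centroids and a spatially correlated, Gaussian response.
n <- 120
plots <- data.frame(
  x = runif(n, 0, 1000),
  y = runif(n, 0, 1000),
  logArea = rep(c(0, 2), times = c(96, 24))  # log10 of 1 m^2 and 100 m^2
)
D <- as.matrix(dist(plots[, c("x", "y")]))
Sigma <- exp(-D / 200)                   # exponential correlation, assumed range 200
e <- drop(t(chol(Sigma)) %*% rnorm(n))   # spatially correlated noise
plots$species <- 10 + 5 * plots$logArea + 3 * e + rnorm(n)

# SAC-ignorant baseline: independent errors.
m0 <- gls(species ~ logArea, data = plots, method = "REML")

# Candidate spatial error structures, each with a nugget effect.
structures <- list(
  exponential = corExp(form = ~ x + y, nugget = TRUE),
  gaussian    = corGaus(form = ~ x + y, nugget = TRUE),
  rational    = corRatio(form = ~ x + y, nugget = TRUE)
)
fits <- lapply(structures, function(cs) {
  tryCatch(
    gls(species ~ logArea, data = plots, correlation = cs, method = "REML"),
    error = function(err) NULL  # drop candidates that fail to fit
  )
})
fits <- c(list(none = m0), Filter(Negate(is.null), fits))

# Lower AIC indicates the better-supported error structure.
aics <- sapply(fits, AIC)
print(sort(aics))

# Slope and standard error from the best-supported model.
best <- fits[[names(which.min(aics))]]
print(summary(best)$tTable["logArea", c("Value", "Std.Error")])
```

Comparing the "none" row against the spatial candidates gives a rough
sense of how much the autocorrelation matters for the fit.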

Tim Handley
Fire Effects Monitor
Santa Monica Mountains National Recreation Area
401 W. Hillcrest Dr.
Thousand Oaks, CA 91360
805-370-2347

From: Manuel Morales <Manuel.A.Morales@williams.edu>
Sent: 08/24/2009 05:31 PM
To: Timothy_Handley at nps.gov
cc: Bert Gunter <gunter.berton at gene.com>, r-help at r-project.org
Subject: Re: [R] lme, lmer, gls, and spatial autocorrelation

Hi Tim,

I don't believe there is a satisfactory solution in R - at least yet -
for non-normal models. Ultimately, this should be possible using lmer()
but not in the near-term. One possibility is to use glmmPQL as described
in:

Dormann, C. F., McPherson, J. M., Araújo, M. B., Bivand, R., Bolliger,
J., Carl, G., Davies, R. G., Hirzel, A., Jetz, W., Kissling, W. D.,
Kühn, I., Ohlemüller, R., Peres-Neto, P. R., Reineking, B., Schröder,
B., Schurr, F. M. and Wilson, R. 2007. Methods to account for spatial
autocorrelation in the analysis of species distributional data: a
review. – Ecography 30: 609–628.

However, note the caution:

"This is an inofficial abuse of a Generalized Linear Mixed Model
function (glmmPQL {MASS}), which is a wrapper function for lme {nlme},
which in turn internally calls gls {nlme}."

If all you need are parameter estimates, fine. If you want to do model
comparison, though, no luck.
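
A rough sketch of the glmmPQL route, assuming the MASS and nlme packages.
The data frame `plots`, its columns, and the constant grouping factor
`group` are hypothetical, simulated only for illustration; the single
dummy group is the workaround that lets a spatial correlation structure
be passed through to lme. As far as I know, glmmPQL reports the
log-likelihood as NA, which is why AIC-based comparison is not available.

```r
library(MASS)  # glmmPQL()
library(nlme)  # corExp()

set.seed(7)
# Hypothetical stand-in data: Poisson counts with spatially correlated
# variation on the log scale.
n <- 120
plots <- data.frame(
  x = runif(n, 0, 1000),
  y = runif(n, 0, 1000),
  logArea = rep(c(0, 2), times = c(96, 24))
)
D <- as.matrix(dist(plots[, c("x", "y")]))
e <- drop(t(chol(exp(-D / 200))) %*% rnorm(n))  # correlated noise
plots$species <- rpois(n, exp(1 + 0.5 * plots$logArea + 0.3 * e))

# Dummy grouping factor: every plot belongs to the same single group.
plots$group <- factor(rep("a", n))

# PQL fit with an exponential spatial correlation structure.
# A failed fit returns NULL instead of stopping the script.
fit <- tryCatch(
  glmmPQL(species ~ logArea, random = ~ 1 | group, family = poisson,
          data = plots, correlation = corExp(form = ~ x + y),
          verbose = FALSE),
  error = function(err) NULL
)

# Parameter estimates only: slope and its standard error.
if (!is.null(fit)) {
  print(summary(fit)$tTable["logArea", c("Value", "Std.Error")])
}
```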

Manuel

On Mon, 2009-08-24 at 12:10 -0700, Timothy_Handley at nps.gov wrote:
> Bert -
>
> I took a look at that page just now, and I'd classify my problem as
> spatial regression. Unfortunately, I don't think the spdep library fits
> my needs. Or at least, I can't figure out how to use it for this problem.
> The examples I have seen all use spdep with networks. They build a graph,
> connecting each location to something like the nearest N neighbors,
> attach some set of weights, and then do an analysis. The plots in my data
> have a very irregular, semi-random, yet somewhat clumped (several
> isolated islands), spatial distribution. Honestly, it's quite weird
> looking. I don't know how to cleanly turn this into a network, and even
> if I did, I don't know that I ought to. To me (and please feel free to
> disagree) it seems more natural to use a matrix of distances and
> associated correlations, which is what the gls function appears to do.
>
> In the ecological analysis section, it looks like both 'ade4' and 'vegan'
> may have helpful tools. I'll explore that some more. However, I still
> think that one of lme or gls already has the functionality I need, and I
> just need to learn how to use them properly.
>
> Tim Handley
> Fire Effects Monitor
> Santa Monica Mountains National Recreation Area
> 401 W. Hillcrest Dr.
> Thousand Oaks, CA 91360
> 805-370-2347
>
> From: Bert Gunter <gunter.berton at gene.com>
> Sent: 08/24/2009 11:43 AM
> To: <Timothy_Handley at nps.gov>, <r-help at r-project.org>
> Subject: RE: [R] lme, lmer, gls, and spatial autocorrelation
>
> Have you looked at the "Spatial" task view on CRAN? That would seem to me
> the logical first place to go.
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of
> Timothy_Handley at nps.gov
> Sent: Monday, August 24, 2009 11:12 AM
> To: r-help at r-project.org
> Subject: [R] lme, lmer, gls, and spatial autocorrelation
>
>
> Hello folks,
>
> I have some data where spatial autocorrelation seems to be a serious
> problem, and I'm unclear on how to deal with it in R. I've tried to do my
> homework - read through 'The R Book,' use the online help in R, search
> the internet, etc. - and I still have some unanswered questions. I'd
> greatly appreciate any help you could offer. The super-super short
> explanation is that I'd like to draw a straight line through my data,
> accounting for spatial autocorrelation and using Poisson errors (I have
> count data). There's a longer explanation at the end of this e-mail, I
> just didn't want to overdo it at the start.
>
> There are three R functions that do at least some of what I would like,
> but I'm unclear on some of their specifics.
>
> 1. lme - Maybe models spatial autocorrelation, but doesn't allow for
> Poisson errors. I get mixed messages from The R Book. On p. 647, there's
> an example that uses lme with temporal autocorrelation, so it seems that
> you can specify a correlation structure. On the other hand, on p. 778,
> The R Book says, "the great advantage of the gls function is that the
> errors are allowed to be correlated". This suggests that only gls (not
> lme or lmer) allows specification of a corStruct class. Though it may
> also suggest that I have an incomplete understanding of these functions.
>
> 2. lmer - Allows specification of a Poisson error structure. However, it
> seems that lmer does not yet handle correlated errors.
>
> 3. gls - Surely works with spatial autocorrelation, but doesn't allow for
> Poisson errors. Does allow the spatial autocorrelation to be assessed
> independently for different groups (I have two groups, one at each of two
> different spatial scales).
>
> Since gls is what The R Book uses in the example of spatial
> autocorrelation, this seems like the best option. I'd rather have Poisson
> errors, but Gaussian would be OK. However, I'm still somewhat confused by
> these three functions. In particular, I'm unclear on the difference
> between lme and gls. I'd feel more confident in my results if I had a
> better understanding of these choices. I'd greatly appreciate advice on
> the matter.
>
>
> More detailed explanation of the data/problem is below:
>
> The data:
> [1] A count of the number of plant species present on each of 96 plots
> that are 1m^2 in area.
> [2] A count of the number of plant species present on each of 24 plots
> that are 100m^2 in area.
> [3] X,Y coordinates for the centroid of all plots (both sizes).
>
> Goal:
> 1. A best fit straight-line relating log10(area) to #species.
> 2. The slope of that line, and the standard error of that slope. (I want
> to compare the slope of this line with the slope of another line)
>
> The problem:
> Spatial autocorrelation. Across our range of plot-separation-distances,
> Moran's I ranges from -.5 to +.25. Depending on the size of the
> distance-bins, about 1 out of 10 of these I values are statistically
> significant. Thus, there seems to be a significant degree of spatial
> autocorrelation. If I want 'good' values for my line parameters, I need
> to account for this somehow.
>
>
> Tim Handley
> Fire Effects Monitor
> Santa Monica Mountains National Recreation Area
> 401 W. Hillcrest Dr.
> Thousand Oaks, CA 91360
> 805-370-2347
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
http://mutualism.williams.edu