[R] understanding the output from gls

Wed Sep 2 02:53:33 CEST 2009

Hi Tim,

On Tue, Sep 1, 2009 at 2:00 PM, <Timothy_Handley at nps.gov> wrote:
>
> I'd like to compare two models which were fitted using gls, however I'm
> having trouble interpreting the results of gls. If any of you could offer
> me some advice, I'd greatly appreciate it.
>
> Short explanation of models: These two models have the same fixed-effects
> structure (two independent, linear effects),

Known to be independent?

> and differ only in that the
> second model includes a corExp structure for spatial autocorrelation. (more
> detailed explanation of the models at end).
>
> Specific questions:
>
> 1. The second model estimates two additional parameters in the process of
> fitting the corSpatial object - the range and nugget of the spatial
> autocorrelation. Based on this, I would expect the second model to have two
> fewer residual degrees of freedom. However, the summary function reports
> that both models have the same number of residual degrees of freedom.  Why
> is this? (Interestingly, the difference in AIC between the two models
> reflects this difference in the number of model parameters)
>

That's a good question.  It does seem degrees of freedom would be
spent in estimating the spatial parameters.

The reason the functions using logLik get the number of parameters
"right" is because logLik.gls includes:

attr(val, "df") <- p + length(coef(object[["modelStruct"]])) + 1

where modelStruct holds the estimated parameters that structure
residual variance and correlation, and I believe p is the number of
columns in the design matrix, but I couldn't easily follow the code
within gls that produces it.

If nobody offers an explanation for the current result from
print.summary.gls you could ask on r-devel if the results are
intended.

> 2. In the model summary, what is the meaning of the small correlation
> matrix under the heading "Correlation:"? At first, I thought that this was
> describing possible correlations among the predictor variables, but then I
> saw that it also included the model intercept. What do these correlation
> value mean?

The GLS covariance of estimated fixed effects (including the
intercept) is (X' \hat{\Sigma}^-1 X)^-1, where X is the model matrix
and \Sigma is the response/error covariance matrix.  This will
generally have non-zero off-diagonals, implying the fixed effects
estimates are correlated.  The values you're inquiring about are the
scaled estimates of those off-diagonals.

hth,

Kingsford Jones

>
> ##More detailed information
> ##function calls:
>  sppl.i.xx = gls(all.all.rch~l10area+newx, data = gtemp, method="ML")
>  sppl.i.ex = gls(all.all.rch~l10area+newx, data = gtemp, method="ML",
>              correlation = corExp(c(20,.8), form=~x+y|area, nugget=TRUE))
>
> ##model summaries
>
>> summary(sppl.i.xx)
> Generalized least squares fit by maximum likelihood
>  Model: all.all.rch ~ l10area + newx
>  Data: gtemp
>       AIC     BIC    logLik
>  567.4893 578.572 -279.7446
>
> Coefficients:
>               Value Std.Error   t-value p-value
> (Intercept) 6.891867 0.3295097 20.915522   0e+00
> l10area     6.586182 0.3048870 21.602046   0e+00
> newx        0.047901 0.0117281  4.084307   1e-04
>
>  Correlation:
>        (Intr) l10are
> l10area -0.364
> newx     0.577 -0.007
>
> Standardized residuals:
>        Min          Q1         Med          Q3         Max
> -3.34307266 -0.57949890 -0.07214605  0.64309760  2.66409931
>
> Residual standard error: 2.590313
> Degrees of freedom: 118 total; 115 residual
>
> summary(sppl.i.ex)
> Generalized least squares fit by maximum likelihood
>  Model: all.all.rch ~ l10area + newx
>  Data: gtemp
>      AIC      BIC    logLik
>  559.167 575.7911 -273.5835
>
> Correlation Structure: Exponential spatial correlation
>  Formula: ~x + y | area
>  Parameter estimate(s):
>     range     nugget
> 15.4448835  0.3741476
>
> Coefficients:
>               Value Std.Error   t-value p-value
> (Intercept) 7.621306 0.7648135  9.964921  0.0000
> l10area     6.400442 0.5588160 11.453576  0.0000
> newx        0.066535 0.0204417  3.254857  0.0015
>
>  Correlation:
>        (Intr) l10are
> l10area -0.592
> newx     0.358  0.014
>
> Standardized residuals:
>       Min         Q1        Med         Q3        Max
> -3.0035983 -0.5990432 -0.2226852  0.5113270  2.4444263
>
> Residual standard error: 2.820337
> Degrees of freedom: 118 total; 115 residual
>
>
>
>
> Tim Handley
> Fire Effects Monitor
> Santa Monica Mountains National Recreation Area
> 401 W. Hillcrest Dr.
> Thousand Oaks, CA 91360
> 805-370-2347
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>