[R] Problem with Variance Components (and general glmmconfusion)

Toby Gardner t.gardner at uea.ac.uk
Thu Sep 7 16:22:22 CEST 2006


Dear Dr Bates,



Many thanks for such a useful response to my problem.



Regarding Variance Components .....



The VarCorr function runs fine for lmer objects once the nlme package is 
removed.



Regarding the format of the nested random effects for an lmer object, you 
said:



In recent versions of lme4 you can use the specification

model2 <- lmer(y ~ 1 + (1|groupA/groupB))

Your version may be correct or not.  It depends on what the distinct
levels of groupB correspond to.  The version with the / is more
reliable.



This works well. These are environmental data measured at plots within sites 
(B), within forests (A).



Here is the model (I have put a dump of the data file used for these 
analyses at the end of this email):



> modelusd<-lmer(USD~1 + (1|forest/site))

> summary(modelusd)

Linear mixed-effects model fit by REML

Formula: USD ~ 1 + (1 | forest/site)

      AIC      BIC    logLik MLdeviance REMLdeviance

 816.7469 825.7788 -405.3734   815.0236     810.7469

Random effects:

 Groups      Name        Variance Std.Dev.

 site:forest (Intercept)  6.2099  2.4920

 forest      (Intercept) 33.0435  5.7483

 Residual                10.4335  3.2301

number of obs: 150, groups: site:forest, 15; forest, 3



Fixed effects:

            Estimate Std. Error t value

(Intercept)   9.8033     3.3909  2.8911



And VarCorr confirms the variance components:



> VarCorr(modelusd)

$`site:forest`

1 x 1 Matrix of class "dpoMatrix"

            (Intercept)

(Intercept)    6.209851



$forest

1 x 1 Matrix of class "dpoMatrix"

            (Intercept)

(Intercept)    33.04345



attr(,"sc")

[1] 3.230100



And following your suggestion I used the HPDinterval to obtain a measure of 
error around the random effects:



> MC.modelusd<-mcmcsamp(modelusd, 50000)

> HPDinterval(MC.modelusd)

                     lower      upper

(Intercept)     -3.5815626  23.202750

log(sigma^2)     2.1168335   2.594976

log(st:f.(In))   0.8209994   3.250159

log(frst.(In))   1.0676778   8.676852

deviance       814.8747050 829.055630

attr(,"Probability")

[1] 0.95



What I am really after are the intra-class correlation coefficients so I can 
demonstrate the variability in a given environmental variable at different 
spatial scales.  I can of course calculate the % variance explained for each 
random effect from the summary(lmer).  However - and this may be a stupid 
question! - but can the intervals for the StDev of the random effects also 
just be transformed to intervals of the variance (and then converted to % 
values for the intra-class correlation coefficients) by squaring?



Ideally I would like to partition the variance explained by all (three) 
spatially nested scales - forest / site / array - where array is the sample 
unit.  Using lmer produces the model summary I want:



> modelusd2<-lmer(USD~1 + (1|forest/site/array))

> summary(modelusd2)



Linear mixed-effects model fit by REML

Formula: USD ~ 1 + (1 | forest/site/array)

      AIC      BIC    logLik MLdeviance REMLdeviance

 818.7469 830.7894 -405.3734   815.0236     810.7469

Random effects:

 Groups              Name        Variance Std.Dev.

 array:(site:forest) (Intercept)  7.5559  2.7488

 site:forest         (Intercept)  6.2099  2.4920

 forest              (Intercept) 33.0435  5.7484

 Residual                         2.8776  1.6963

number of obs: 150, groups: array:(site:forest), 150; site:forest, 15; 
forest, 3



Fixed effects:

            Estimate Std. Error t value

(Intercept)   9.8033     3.3909  2.8911



However - the mcmcsamp process fails



> MC.modelusd2<-mcmcsamp(modelusd2, 50000)



with this error message:



Error: Leading minor of order 1 in downdated X'X is not positive definite

Error in t(.Call(mer_MCMCsamp, object, saveb, n, trans, verbose)) :

            unable to find the argument 'x' in selecting a method for 
function 't'

>



Am I trying something impossible here?



Regarding GLMMs..(now with species count data, blocking random factors and 
multiple fixed factors)



When using lmer I would suggest using method = "Laplace" and perhaps
control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
another reply to the list a few minutes ago.



This seems to work well, thanks.



With the greatest respect to all concerned, if I could I would like to echo 
the request by Martin Maechler on the list a few weeks ago that it would be 
extremely useful (especially for newcomers like me - and likely would 
greatly reduce the traffic on this list looking at many of the past threads) 
if authors of packages were able to be explicit in the help files about how 
functions differ (key advantages and disadvantages) from packages offering 
otherwise very similar functions (e.g. lmer/glmmML - although the subsequent 
comment by Dr Bates on this helped a lot).



Many thanks!



Toby Gardner



platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          3.1
year           2006
month          06
day            01
svn rev        38247
language       R
version.string Version 2.3.1 (2006-06-01)



> dump("testdata", file=stdout())
testdata <-
structure(list(forest = structure(as.integer(c(2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Label = c("EUC",
"PF", "SF"), class = "factor"), site = as.integer(c(1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
)), Array = as.integer(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)), USD = c(13.67,
13, 9.33, 10.67, 11, 10, 11.67, 10.33, 7, 8, 5.67, 7.67, 13.67,
10.33, 10.67, 7, 10.33, 12, 14, 7.67, 6.67, 4.33, 9, 13.67, 11,
19, 14, 19.67, 18.67, 8.33, 4.67, 10, 5, 11, 8, 7.67, 11, 12,
11, 7.67, 13.67, 15.33, 12, 11.33, 14.67, 13.33, 7, 12, 11.33,
11.33, 0.67, 3.33, 2, 2.67, 0.33, 1.33, 1.33, 1, 0.67, 0, 3.33,
3.33, 5.67, 4.67, 1.33, 3.67, 1, 6.33, 3, 1.67, 3.67, 5.67, 5.33,
2.67, 1.67, 2.33, 3.67, 6.67, 5.33, 9, 8, 5.67, 2.67, 0, 4.33,
8, 6, 3.67, 10.67, 9, 0.67, 1, 1.33, 0.67, 3, 1.5, 0.67, 0.33,
7.67, 7.33, 22, 18.67, 16.67, 21.33, 22.33, 22.33, 17.67, 16.33,
19.67, 20.67, 21.33, 17.33, 19, 19.33, 18.33, 18.67, 18.33, 19,
18.33, 19.33, 17.33, 12.33, 9.33, 8.33, 6, 17, 18, 8, 12, 15.67,
6, 1.33, 19, 11.67, 7, 16.33, 16, 14, 10.33, 4, 19, 19.67, 14,
15.33, 14, 6.33, 11.33, 11.67, 14, 15.33)), .Names = c("forest",
"site", "Array", "USD"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
"25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
"36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
"47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
"58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
"69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
"91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
"101", "102", "103", "104", "105", "106", "107", "108", "109",
"110", "111", "112", "113", "114", "115", "116", "117", "118",
"119", "120", "121", "122", "123", "124", "125", "126", "127",
"128", "129", "130", "131", "132", "133", "134", "135", "136",
"137", "138", "139", "140", "141", "142", "143", "144", "145",
"146", "147", "148", "149", "150"))







----- Original Message ----- 
From: "Douglas Bates" <bates at stat.wisc.edu>
To: "Toby Gardner" <t.gardner at uea.ac.uk>
Cc: "R-help" <r-help at stat.math.ethz.ch>
Sent: Thursday, September 07, 2006 1:23 AM
Subject: Re: [R] Problem with Variance Components (and general 
glmmconfusion)


> On 9/4/06, Toby Gardner <t.gardner at uea.ac.uk> wrote:
>> Dear list,
>>
>> I am having some problems with extracting Variance Components from a 
>> random-effects model:
>>
>> I am running a simple random-effects model using lme:
>>
>> model<-lme(y~1,random=~1|groupA/groupB)
>>
>> which returns the output for the StdDev of the Random effects, and model 
>> AIC etc as expected.
>>
>> Until yesterday I was using R v. 2.0, and had no problem in calling the 
>> variance components of the above model using VarCorr(model), together 
>> with their 95% confidence intervals using intervals() - although for some 
>> response variables a call to intervals() returns the error: Cannot get 
>> confidence intervals on var-cov components: Non-positive definite 
>> approximate variance-covariance.
>>
>> I have now installed R v. 2.3.1 and am now experiencing odd behaviour 
>> with VarCorr(lme.object), with an error message typically being returned:
>>
>> Error in VarCorr(model) : no direct or inherited method for function 
>> 'VarCorr' for this call
>>
>> Is this known to happen? For instance could it be due to the subsequent 
>> loading of new packages? (lme4 for instance?).
>
> Yes.  Avoid loading lme4 and nlme simultaneously.
>
>>
>> To get around this problem I have tried running the same model using 
>> lmer:
>>
>> model2<-lmer(y~1 + (1|groupA) + (1|groupB))
>
> In recent versions of lme4 you can use the specification
>
> model2 <- lmer(y ~ 1 + (1|groupA/groupB))
>
> Your version may be correct or not.  It depends on what the distinct
> levels of groupB correspond to.  The version with the / is more
> reliable.
>
>>
>> Should this not produce the same model? The variance components are very 
>> similar but not identical, making me think that I am doing something 
>> wrong. I am also correct in thinking that intervals() does not work with 
>> lmer? I get: Error in intervals(model2) : no applicable method for 
>> "intervals"
>
> That is correct.  Currently there is no intervals method for an lmer
> model.  You can use mcmcsamp to get a Markov chain Monte Carlo sample
> to which you can apply HPDinterval from the "coda" package.  However,
> these are stochastic intervals so it is best to try on a couple of
> chains to check on the reproducibility or the intervals.
>
>>
>> GLMM
>>
>> I have a general application question - please excuse my ignorance, I am 
>> relatively new to this and trying to find a way through the maze.  In 
>> short I need to compile generalized linear mixed models both for (a) 
>> Poisson data and (b) binonial data incorporating a two nested random 
>> factors, and I need to be able to extract AIC values as I am taking an 
>> information-theoretic approach to model selection.  Prior to sending an 
>> email to the list I have spent quite a few days reading the background on 
>> a number of functions, all of which offer potential for this; glmmML, 
>> glmmPQL, lmer, and glmmADMB.  I can understand that glmmPQL is unsuitable 
>> because there is no way of knowing the maximised likelihood, but is there 
>> much difference between the remaining three options? I have seen 
>> simulation comparisons published on this list between glmmADMB and 
>> glmmPQL and lmer, but it seems these are before the latest release of 
>> lmer, and also they do not evaluate glmmML.  To a newcomer this myria!
> d !
>>  of options is bewildering, can anyone offer advice as to the most robust 
>> approach?
>
> Goran can correct me if I am wrong but I don't believe that glmmML can
> be used with multiple levels of random effects.
>
> I'm not sure what the status of glmmADMB is these days.  There was
> some controversy regarding the license applied to some of that code a
> while back.  I don't know if it has been resolved to everyone's
> satisfaction.
>
> When using lmer I would suggest using method = "Laplace" and perhaps
> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
> another reply to the list a few minutes ago.
>
> Let us know how it works out.
>
>>
>> Many thanks for your time and patience,
>>
>> Toby Gardner
>>
>> School of Environmental Sciences
>> University of East Anglia
>> Norwich, NR4 7TJ
>> United Kingdom
>> Email: t.gardner at uea.ac.uk
>> Website: www.uea.ac.uk/~e387495
>>
>>         [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list