[R] Problem with Variance Components (and general glmmconfusion)

Spencer Graves spencer.graves at pdf.com
Sun Sep 17 19:30:44 CEST 2006


      With minor modifications, I got your example to work.  After 
reading in your 'testdata', I then made 'site' and 'Array' factors: 

testdata$site <- factor(testdata$site)
testdata$Array <- factor(testdata$Array)

# Note:  S-Plus and R are case sensitive, so 'array' and 'Array' are two 
different things. 

      Then rather than using 'attach', I used the 'data' argument.  It's 
extra bookkeeping, but to me it seems cleaner. 

library(lme4)
modelusd<-lmer(USD~1 + (1|forest/site), data=testdata)
MC.modelusd<-mcmcsamp(modelusd, 50000)
modelusd2<-lmer(USD~1 + (1|forest/site/Array), data=testdata)
MC.modelusd2<-mcmcsamp(modelusd2, 50000)

      These all worked fine for me;  see 'sessionInfo()' below. 

      However, the second model is nonsense, because you have only one 
level of Array within each forest/site combination.  Thus, Array is 
completely confounded with the residual.  The variance decompositions for 
modelusd and modelusd2 are identical, except that the residual variance 
in the first is the sum of the Array:(site:forest) and residual variances 
in the second.  From HPDinterval, I got the following for the two: 

 > HPDinterval(MC.modelusd)
                     lower      upper
(Intercept)     -4.6265645  24.168019
log(sigma^2)     2.1184607   2.595467
log(st:f.(In))   0.8987962   2.901460
log(frst.(In))   1.7248681   6.889331
deviance       814.9391134 827.032135
attr(,"Probability")
[1] 0.95
 > HPDinterval(MC.modelusd2)
                     lower      upper
(Intercept)     -4.6414218  23.653071
log(sigma^2)   -16.4372306   2.666573
log(A:(:.(In))  -4.6028284   2.744530
log(st:f.(In))   0.8660924   2.987095
log(frst.(In))   1.4793119   6.981519
deviance       814.8988318 827.634986
attr(,"Probability")
[1] 0.95
 >
      The intervals are similar for site and forest between the two 
models.  However, the interval for log(sigma^2) is not terribly wide for 
the first model, while in the second both log(sigma^2) and 
log(Variance(Array)) are essentially unbounded below:  the lower limit 
for sigma^2 is exp(-16.44) = 7e-8, and for var(Array) it is exp(-4.6) = 
0.01. 
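      You can see the confounding directly from the design: each Array 
level occurs exactly once within each forest/site cell, so there is one 
observation per Array:(site:forest) group and no replication from which 
to separate the Array variance from sigma^2.  A self-contained toy layout 
with the same structure (not your data) makes the point: 

```r
## Toy layout with the same shape as 'testdata': 3 forests x 5 sites x
## 10 arrays, one row per array (hypothetical, for illustration only).
toy <- expand.grid(Array  = factor(1:10),
                   site   = factor(1:5),
                   forest = factor(c("EUC", "PF", "SF")))

## Every forest:site:Array cell holds exactly one observation ...
max(xtabs(~ forest + site + Array, data = toy))

## ... so the finest grouping factor has as many levels as observations,
## leaving nothing to distinguish its variance from the residual:
nlevels(interaction(toy$forest, toy$site, toy$Array)) == nrow(toy)
```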

      Regarding intraclass correlation, it's not clear to me what 
definition you are using.  (Dave Howell notes that there are many 
different definitions;  
"www.uvm.edu/~dhowell/StatPages/More_Stuff/icc/icc.html".)   Whatever 
definition you are using, however, there is no easy way to combine 
multiple intervals into one with a transformation.  Before MCMC, it was 
common to use a first-order Taylor approximation (the 'delta method'). 
That procedure was fine when nothing else sensible was available, but it 
could be highly misleading if the original intervals were not symmetric 
or if the linear approximation was not good over most of the range of 
interest.  Fortunately, with MCMC output, it's fairly easy: transform 
the samples, then compute the interval.  The documentation 
says that the output of  'mcmcsamp' is an object of class 'mcmc'.  Using 
'str', I find that it looks like it might be a numeric array, which is 
confirmed as follows: 

 > is.array(MC.modelusd)
[1] TRUE

      Therefore, I will try to just add columns to that array as follows: 

MC.modelusd. <- cbind(MC.modelusd,
         sigma2=exp(MC.modelusd[, "log(sigma^2)"]),
         site=exp(MC.modelusd[, "log(st:f.(In))"]),
         forest=exp(MC.modelusd[, "log(frst.(In))"] ) )
MC.modelusd.a <- cbind(MC.modelusd.,
     intraclass.site=MC.modelusd.[, "site"] / (
       MC.modelusd.[, "site"]+MC.modelusd.[, "sigma2"]) )

      When I tried 'HPDinterval(MC.modelusd.a)' at this point, I got an 
error message.  Therefore, I changed the class and tried again: 

class(MC.modelusd.a) <- class(MC.modelusd)
HPDinterval(MC.modelusd.a)
                        lower       upper
(Intercept)      -4.626564547  24.1680185
log(sigma^2)      2.118460743   2.5954665
log(st:f.(In))    0.898796162   2.9014600
log(frst.(In))    1.724868091   6.8893307
deviance        814.939113359 827.0321354
sigma2            8.198757903  13.2489330
site              1.627992273  15.8475869
forest            0.003788248 650.2006949
intraclass.site   0.173906803   0.6321633
attr(,"Probability")
[1] 0.95
 
      If you follow this example, I believe you should be able to get 
other desired "highest posterior density (HPD)" intervals. 

      Hope this helps. 
      Spencer Graves
 > sessionInfo()
Version 2.3.1 Patched (2006-08-13 r38872)
i386-pc-mingw32

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"    

other attached packages:
      coda       lme4     Matrix    lattice
  "0.10-6"  "0.995-2" "0.995-16"  "0.13-10"

Toby Gardner wrote:

<snip>
> What I am really after are the intra-class correlation coefficients so I can 
> demonstrate the variability in a given environmental variable at different 
> spatial scales.  I can of course calculate the % variance explained for each 
> random effect from the summary(lmer).  However - and this may be a stupid 
> question! - can the intervals for the StDev of the random effects also 
> just be transformed to intervals of the variance (and then converted to % 
> values for the intra-class correlation coefficients) by squaring?
>   
SG:  NO, there is no easy transformation of the intervals. 
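SG:  A toy illustration of why not (made-up draws, not your posterior): 
the intraclass correlation depends on two variance components at once, so 
no single transformation of either component's interval gives an interval 
for the ratio.  Transforming the samples first, then computing the 
interval, is the safe route: 

```r
## Made-up MCMC-style draws for two variance components (illustration only):
set.seed(42)
v.site <- exp(rnorm(10000, mean = 1.5, sd = 0.5))  # site-variance draws
sigma2 <- exp(rnorm(10000, mean = 2.3, sd = 0.1))  # residual-variance draws

## Interval computed from the transformed draws -- the right way:
icc <- v.site / (v.site + sigma2)
quantile(icc, c(0.025, 0.975))

## Naive endpoint arithmetic on the two separate intervals -- not the same:
qs <- quantile(v.site, c(0.025, 0.975))
qe <- quantile(sigma2, c(0.025, 0.975))
qs / (qs + qe)
```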
> Ideally I would like to partition the variance explained by all (three) 
> spatially nested scales - forest / site / array - where array is the sample 
> unit.  Using lmer produces the model summary I want:
>
>> modelusd2<-lmer(USD~1 + (1|forest/site/array))
>> summary(modelusd2)
>
> Linear mixed-effects model fit by REML
> Formula: USD ~ 1 + (1 | forest/site/array)
>       AIC      BIC    logLik MLdeviance REMLdeviance
>  818.7469 830.7894 -405.3734   815.0236     810.7469
>
> Random effects:
>  Groups              Name        Variance Std.Dev.
>  array:(site:forest) (Intercept)  7.5559  2.7488
>  site:forest         (Intercept)  6.2099  2.4920
>  forest              (Intercept) 33.0435  5.7484
>  Residual                         2.8776  1.6963
> number of obs: 150, groups: array:(site:forest), 150; site:forest, 15; 
> forest, 3
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)   9.8033     3.3909  2.8911
>
>
> However - the mcmcsamp process fails
>
>> MC.modelusd2<-mcmcsamp(modelusd2, 50000)
>
> with this error message:
>
> Error: Leading minor of order 1 in downdated X'X is not positive definite
> Error in t(.Call(mer_MCMCsamp, object, saveb, n, trans, verbose)) :
>             unable to find the argument 'x' in selecting a method for 
> function 't'
>
> Am I trying something impossible here?
>
> Regarding GLMMs... (now with species count data, blocking random factors 
> and multiple fixed factors)
>
> When using lmer I would suggest using method = "Laplace" and perhaps
> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
> another reply to the list a few minutes ago.
>
> This seems to work well, thanks.
>
> With the greatest respect to all concerned, if I could I would like to echo 
> the request by Martin Maechler on the list a few weeks ago that it would be 
> extremely useful (especially for newcomers like me - and likely would 
> greatly reduce the traffic on this list looking at many of the past threads) 
> if authors of packages were able to be explicit in the help files about how 
> functions differ (key advantages and disadvantages) from packages offering 
> otherwise very similar functions (e.g. lmer/glmmML - although the subsequent 
> comment by Dr Bates on this helped a lot).
>
> Many thanks!
>
> Toby Gardner
>
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          3.1
> year           2006
> month          06
> day            01
> svn rev        38247
> language       R
> version.string Version 2.3.1 (2006-06-01)
>
>> dump("testdata", file=stdout())
> testdata <-
> structure(list(forest = structure(as.integer(c(2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), .Label = c("EUC",
> "PF", "SF"), class = "factor"), site = as.integer(c(1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
> 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5,
> 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
> 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5
> )), Array = as.integer(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2,
> 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)), USD = c(13.67,
> 13, 9.33, 10.67, 11, 10, 11.67, 10.33, 7, 8, 5.67, 7.67, 13.67,
> 10.33, 10.67, 7, 10.33, 12, 14, 7.67, 6.67, 4.33, 9, 13.67, 11,
> 19, 14, 19.67, 18.67, 8.33, 4.67, 10, 5, 11, 8, 7.67, 11, 12,
> 11, 7.67, 13.67, 15.33, 12, 11.33, 14.67, 13.33, 7, 12, 11.33,
> 11.33, 0.67, 3.33, 2, 2.67, 0.33, 1.33, 1.33, 1, 0.67, 0, 3.33,
> 3.33, 5.67, 4.67, 1.33, 3.67, 1, 6.33, 3, 1.67, 3.67, 5.67, 5.33,
> 2.67, 1.67, 2.33, 3.67, 6.67, 5.33, 9, 8, 5.67, 2.67, 0, 4.33,
> 8, 6, 3.67, 10.67, 9, 0.67, 1, 1.33, 0.67, 3, 1.5, 0.67, 0.33,
> 7.67, 7.33, 22, 18.67, 16.67, 21.33, 22.33, 22.33, 17.67, 16.33,
> 19.67, 20.67, 21.33, 17.33, 19, 19.33, 18.33, 18.67, 18.33, 19,
> 18.33, 19.33, 17.33, 12.33, 9.33, 8.33, 6, 17, 18, 8, 12, 15.67,
> 6, 1.33, 19, 11.67, 7, 16.33, 16, 14, 10.33, 4, 19, 19.67, 14,
> 15.33, 14, 6.33, 11.33, 11.67, 14, 15.33)), .Names = c("forest",
> "site", "Array", "USD"), class = "data.frame", row.names = c("1",
> "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
> "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24",
> "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46",
> "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57",
> "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68",
> "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79",
> "80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90",
> "91", "92", "93", "94", "95", "96", "97", "98", "99", "100",
> "101", "102", "103", "104", "105", "106", "107", "108", "109",
> "110", "111", "112", "113", "114", "115", "116", "117", "118",
> "119", "120", "121", "122", "123", "124", "125", "126", "127",
> "128", "129", "130", "131", "132", "133", "134", "135", "136",
> "137", "138", "139", "140", "141", "142", "143", "144", "145",
> "146", "147", "148", "149", "150"))
>
> ----- Original Message ----- 
> From: "Douglas Bates" <bates at stat.wisc.edu>
> To: "Toby Gardner" <t.gardner at uea.ac.uk>
> Cc: "R-help" <r-help at stat.math.ethz.ch>
> Sent: Thursday, September 07, 2006 1:23 AM
> Subject: Re: [R] Problem with Variance Components (and general 
> glmmconfusion)
>
>> On 9/4/06, Toby Gardner <t.gardner at uea.ac.uk> wrote:
>>     
>>> Dear list,
>>>
>>> I am having some problems with extracting Variance Components from a 
>>> random-effects model:
>>>
>>> I am running a simple random-effects model using lme:
>>>
>>> model<-lme(y~1,random=~1|groupA/groupB)
>>>
>>> which returns the output for the StdDev of the Random effects, and model 
>>> AIC etc as expected.
>>>
>>> Until yesterday I was using R v. 2.0, and had no problem in calling the 
>>> variance components of the above model using VarCorr(model), together 
>>> with their 95% confidence intervals using intervals() - although for some 
>>> response variables a call to intervals() returns the error: Cannot get 
>>> confidence intervals on var-cov components: Non-positive definite 
>>> approximate variance-covariance.
>>>
>>> I have now installed R v. 2.3.1 and am now experiencing odd behaviour 
>>> with VarCorr(lme.object), with an error message typically being returned:
>>>
>>> Error in VarCorr(model) : no direct or inherited method for function 
>>> 'VarCorr' for this call
>>>
>>> Is this known to happen? For instance could it be due to the subsequent 
>>> loading of new packages? (lme4 for instance?).
>>>       
>> Yes.  Avoid loading lme4 and nlme simultaneously.
>>
>>     
>>> To get around this problem I have tried running the same model using 
>>> lmer:
>>>
>>> model2<-lmer(y~1 + (1|groupA) + (1|groupB))
>>>       
>> In recent versions of lme4 you can use the specification
>>
>> model2 <- lmer(y ~ 1 + (1|groupA/groupB))
>>
>> Your version may be correct or not.  It depends on what the distinct
>> levels of groupB correspond to.  The version with the / is more
>> reliable.
>>
>>     
>>> Should this not produce the same model? The variance components are very 
>>> similar but not identical, making me think that I am doing something 
>>> wrong. I am also correct in thinking that intervals() does not work with 
>>> lmer? I get: Error in intervals(model2) : no applicable method for 
>>> "intervals"
>>>       
>> That is correct.  Currently there is no intervals method for an lmer
>> model.  You can use mcmcsamp to get a Markov chain Monte Carlo sample
>> to which you can apply HPDinterval from the "coda" package.  However,
>> these are stochastic intervals so it is best to try on a couple of
>> chains to check the reproducibility of the intervals.
>>
>>     
>>> GLMM
>>>
>>> I have a general application question - please excuse my ignorance, I am 
>>> relatively new to this and trying to find a way through the maze.  In 
>>> short I need to compile generalized linear mixed models both for (a) 
>>> Poisson data and (b) binomial data incorporating two nested random 
>>> factors, and I need to be able to extract AIC values as I am taking an 
>>> information-theoretic approach to model selection.  Prior to sending an 
>>> email to the list I have spent quite a few days reading the background on 
>>> a number of functions, all of which offer potential for this; glmmML, 
>>> glmmPQL, lmer, and glmmADMB.  I can understand that glmmPQL is unsuitable 
>>> because there is no way of knowing the maximised likelihood, but is there 
>>> much difference between the remaining three options? I have seen 
>>> simulation comparisons published on this list between glmmADMB and 
>>> glmmPQL and lmer, but it seems these are before the latest release of 
>>> lmer, and also they do not evaluate glmmML.  To a newcomer this myriad
>>> of options is bewildering; can anyone offer advice as to the most robust
>>> approach?
>>>       
>> Goran can correct me if I am wrong but I don't believe that glmmML can
>> be used with multiple levels of random effects.
>>
>> I'm not sure what the status of glmmADMB is these days.  There was
>> some controversy regarding the license applied to some of that code a
>> while back.  I don't know if it has been resolved to everyone's
>> satisfaction.
>>
>> When using lmer I would suggest using method = "Laplace" and perhaps
>> control = list(usePQL = FALSE, msVerbose = 1) as I mentioned in
>> another reply to the list a few minutes ago.
>>
>> Let us know how it works out.
>>
>>     
>>> Many thanks for your time and patience,
>>>
>>> Toby Gardner
>>>
>>> School of Environmental Sciences
>>> University of East Anglia
>>> Norwich, NR4 7TJ
>>> United Kingdom
>>> Email: t.gardner at uea.ac.uk
>>> Website: www.uea.ac.uk/~e387495
>>>
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>       
>


