[Rd] Bug in getVarCov.gls method (PR#9752)

Tue Jun 26 18:13:42 CEST 2007

This is a multi-part message in MIME format.
--------------000206090008050103050209
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Two attachments:

1. getVarCovBugReport.R  - Rcode with an example illustrating the 
problem and how to fix it
2. getVarCovBugReportSession.txt contains code and R session results.

Thank you

Andrzej Galecki

Prof Brian Ripley wrote:

> On Mon, 25 Jun 2007, agalecki at umich.edu wrote:
>
>> I am using R2.5 under Windows.
>
>
> I presume you mean 2.5.0 (there is no R2.5: see the posting guide).  
> But which version of nlme, which is the relevant fact here?  The R 
> posting guide suggests showing the output of sessionInfo() to 
> establish the environment used.
>
>> Looks like the following statement
>>
>> vars  <-  (obj$sigma^2)*vw
>>
>> in getVarCov.gls method (nlme package) needs to  be replaced with:
>>
>> vars <- (obj$sigma*vw)^2
>
>
> We need some evidence!  Please supply a reproducible example and give 
> your reasoning.  For example, the FAQ says
>
>   The most important principle in reporting a bug is to report _facts_,
>   not hypotheses or categorizations.  It is always easier to report the
>   facts, but people seem to prefer to strain to posit explanations and
>   report them instead.  If the explanations are based on guesses about 
> how
>   R is implemented, they will be useless; others will have to try to 
> figure
>   out what the facts must have been to lead to such speculations.
>   Sometimes this is impossible.  But in any case, it is unnecessary work
>   for the ones trying to fix the problem.
>
>   It is very useful to try and find simple examples that produce
>   apparently the same bug, and somewhat useful to find simple examples
>   that might be expected to produce the bug but actually do not.  If you
>   want to debug the problem and find exactly what caused it, that is
>   wonderful.  You should still report the facts as well as any
>   explanations or solutions. Please include an example that reproduces 
> the
>   problem, preferably the simplest one you have found.
>
> It should be easily possible to cross-check an example with one of the 
> many other ways available to do GLS fits in R.
>
> [...]
>
>

--------------000206090008050103050209
Content-Type: text/plain;
 name="getVarCovBugReport.R"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="getVarCovBugReport.R"

ls()
require(nlme)
sessionInfo()
str(Orthodont)
formula(Orthodont)

# variances of <distance> variable at four different ages  
attach(Orthodont)
d8   <- distance[age==8]
d10  <- distance[age==10]
d12  <- distance[age==12]
d14  <- distance[age==14]
vars0<- diag(var(cbind(d8,d10,d12,d14)))
vars0
detach(Orthodont)

# Model with four means and unstructured covariance matrix 
# to _illustrate_ that getVarCov does not work properly
# It is expected that marginal variances from the following model 
# will be close to <vars0>. 
gls.fit <- gls(distance ~ -1 + factor(age),
    correlation= corSymm(form=~1|Subject), 
    weights=varIdent(form=~1|age),
    data= Orthodont)

# V matrix extracted using getVarCov 
Vmtx <- getVarCov(gls.fit,individual="M01") 
vars.incorrect <- diag(Vmtx)     # incorrect values on the diagonal 
vars.incorrect

# Manualy calculated variances (correct values) based on gls.fit
# Code used here is similar to getVarCov, with one statement corrected 
obj   <- gls.fit
ind   <- obj$groups == "M01"
vw    <- 1/varWeights(obj$modelStruct$varStruct)[ind]  # from getVarCov
vars  <- (obj$sigma *vw)^2   # Corrected statement   

# Please note that vars.corrected is very close to
# vars. (difference most likely due to rounding error)
vars.corrected <- vars.incorrect * vw  
vars.corrected 

--------------000206090008050103050209
Content-Type: text/plain;
 name="getVarCovBugReportSession.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="getVarCovBugReportSession.txt"

> ls()
character(0)
> require(nlme)
Loading required package: nlme
[1] TRUE
> sessionInfo()
R version 2.5.0 (2007-04-23) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"  
[7] "base"     

other attached packages:
    nlme 
"3.1-80" 
> #str(Orthodont)
> #formula(Orthodont)
> 
> # variances of <distance> variable at four different ages  
> attach(Orthodont)
> d8   <- distance[age==8]
> d10  <- distance[age==10]
> d12  <- distance[age==12]
> d14  <- distance[age==14]
> vars0<- diag(var(cbind(d8,d10,d12,d14)))
> vars0
      d8      d10      d12      d14 
5.925926 4.653846 7.938746 7.654558 
> detach(Orthodont)
> 
> 
> # Model with four means and unstructured covariance matrix 
> # to _illustrate_ that getVarCov does not work properly
> # It is expected that marginal variances from the following model 
> # will be close to <vars0>. 
> gls.fit <- gls(distance ~ -1 + factor(age),
+     correlation= corSymm(form=~1|Subject), 
+     weights=varIdent(form=~1|age),
+     data= Orthodont)
> 
> # V matrix extracted using getVarCov 
> Vmtx <- getVarCov(gls.fit,individual="M01") 
> vars.incorrect <- diag(Vmtx)     # incorrect values on the diagonal 
> vars.incorrect
[1] 5.925941 5.251514 6.858886 6.735022
> 
> # Manualy calculated variances (correct values) based on gls.fit
> # Code used here is similar to getVarCov, with one statement corrected 
> obj   <- gls.fit
> ind   <- obj$groups == "M01"
> vw    <- 1/varWeights(obj$modelStruct$varStruct)[ind]  # from getVarCov
> vars  <- (obj$sigma *vw)^2   # Corrected statement   
> 
> 
> # Please note that vars.corrected is very close to
> # vars. (difference most likely due to rounding error)
> vars.corrected <- vars.incorrect * vw  
> vars.corrected 
       8       10       12       14 
5.925941 4.653843 7.938708 7.654569 
> 

--------------000206090008050103050209--