[Rd] Re: [R] H-F corr.: covariance matrix for interaction effect

Peter Dalgaard p.dalgaard at biostat.ku.dk
Mon Feb 28 23:46:47 CET 2005


Peter Dalgaard <p.dalgaard at biostat.ku.dk> writes:

> Where I would have expected
> 
> > (20*5*0.6917-2)/(5*(19-5*.6917))
> [1] 0.8643953
> 
> Does anyone have a clue as to what is going on here? Is mighty SAS
> simply doing the wrong thing? The G-G epsilon depends only on the
> eigenvalues of the observed covariance matrix, so surely the H-F
> correction should depend only on the dimension and the DF for the
> empirical covariance matrix? 

Just in case anyone was wondering, I think I now know what SAS is
doing, and yes, it is a bug. 

The HF correction is

HFeps = (n * (k-1) * GGeps - 2) / ((k-1) * ((n-1) - (k-1) * GGeps))

for the simple two-way layout, where the residual SSD matrix has (n-1)
degrees of freedom. For the case with covariates, it looks like (to 4
significant digits) SAS is generalizing the above to

HFeps = (n * (k-1) * GGeps - 2) / ((k-1) * (f - (k-1) * GGeps))

where f is the degrees of freedom for the SSD. However, the first n
also needs adjustment; the correctly generalized formula should read

HFeps = ((f+1) * (k-1) * GGeps - 2) / ((k-1) * (f - (k-1) * GGeps))
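
As a rough sketch in R, the three variants can be written as small
functions. The function names are made up for illustration and are not
part of R or SAS, and k-1 is taken to be the dimension of the SSD
matrix, as read off the quoted calculation (GGeps = 0.6917, k-1 = 5,
f = 19). With those numbers the last version reproduces the expected
value.

```r
# Hypothetical helper names, for illustration only.
# gg: G-G epsilon; k - 1: dimension of the SSD matrix (assumption);
# n: number of subjects; f: degrees of freedom of the residual SSD matrix.

# Simple two-way layout, where the SSD has n - 1 degrees of freedom
hf_simple <- function(gg, n, k) {
  (n * (k - 1) * gg - 2) / ((k - 1) * ((n - 1) - (k - 1) * gg))
}

# What SAS appears to compute: f in the denominator, n kept in the numerator
hf_sas <- function(gg, n, k, f) {
  (n * (k - 1) * gg - 2) / ((k - 1) * (f - (k - 1) * gg))
}

# Generalization with n replaced by f + 1 in the numerator as well
hf_general <- function(gg, k, f) {
  ((f + 1) * (k - 1) * gg - 2) / ((k - 1) * (f - (k - 1) * gg))
}

hf_general(0.6917, k = 6, f = 19)
# [1] 0.8643953
```

Whenever n is larger than f+1 and the denominator is positive, hf_sas
comes out larger than hf_general, since only the numerator differs.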

(The G-G epsilon is essentially the squared mean of the eigenvalues of
a suitably transformed SSD divided by the mean of the squares of the
eigenvalues. This is less than one unless all eigenvalues are
identical. H-F replaces numerator and denominator with bias-corrected
variants. However, since everything is a function of the SSD matrix,
the formula can only depend on n via the degrees of freedom.)
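
The eigenvalue description of the G-G epsilon can be sketched in R as
follows. The function name is hypothetical, and the eigenvalues are
assumed to come from the already transformed SSD matrix; the
transformation itself is not shown here.

```r
# Hypothetical helper: squared mean of the eigenvalues divided by
# the mean of their squares.
gg_from_eigen <- function(lambda) {
  mean(lambda)^2 / mean(lambda^2)
}

gg_from_eigen(c(1, 1, 1, 1))   # all eigenvalues identical
# [1] 1
gg_from_eigen(c(4, 2, 1, 1))   # unequal eigenvalues
# [1] 0.7272727
```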

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907


