[Rd] Pearson residuals (PR#1123)

p.dalgaard@biostat.ku.dk p.dalgaard@biostat.ku.dk
Thu, 11 Oct 2001 13:24:34 +0200 (MET DST)


[PD blundered when replying to bug report...]

"Carmen Fernandez" <carmen@mcs.st-and.ac.uk> writes:

> No Peter, the iterative weighting matrix is diagonal with elements equal to
> the INVERSE of
> 
> V[Y_i] * (g'(mu_i))^2
> 
> which is why you get into trouble. McCullagh and Nelder (but also any other
> book on GLMs) is a good reference for this. I'm still pretty sure that
> earlier versions of R were fine.

We can easily agree that it's not right... The culprit is this

revision 1.54
date: 2001/06/07 17:43:06;  author: tlumley;  state: Exp;  lines: +1 -1
wrong sign of pearson residual for inverse link (PR#862)

diff -u -r1.53 -r1.54
--- src/library/base/R/glm.R    1 Jun 2001 11:48:34 -0000       1.53
+++ src/library/base/R/glm.R    7 Jun 2001 17:43:06 -0000       1.54
@@ -651,7 +651,7 @@
                       d.res <-
sqrt(pmax((object$family$dev.resids)(y, mu, wts), 0))
                       ifelse(y > mu, d.res, -d.res)
                   } else rep(0, length(mu)),
-                  pearson = r * sqrt(object$weights),
+                  pearson = (y-mu)/sqrt(object$weights),
                   working = r,
                   response = y - mu,

Which is listed as Thomas' doing, but I suspect it was discussed
internally, and thus a collective piece of absentmindedness.

What we have in glm.fit is

w <- sqrt((weights[good] * mu.eta.val[good]^2)/variance(mu)[good])

which later gets squared before it is returned, so the weights are
essentially 

mu.eta.val^2/variance(mu) 

which is what you are saying since the link derivatives are the
"backwards" d(mu)/d(eta). The thing that is confusing is that for a
loglinear Poisson model this comes out as mu^2/mu = mu. AFAIR, similar
cancellation happens in all canonical GLIMs.

For the identity link, it comes out as 1/mu, which is why it seemingly
worked to switch from divide to multiply....

(And of course in ordinary weighted regression, the weights *are* 
usually the reciprocal variances.)

Cc'ed to r-bugs, hoping that someone with a clear mind will do the
right thing...


-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)             FAX: (+45) 35327907

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._