[Rd] Standardized Pearson residuals

peter dalgaard pdalgd at gmail.com
Thu Mar 17 15:45:01 CET 2011

On Mar 16, 2011, at 23:34 , John Maindonald wrote:

> One can easily test for the binary case and not give the statistic in that case.

Issuing a warning when expected cell counts are < 5 would be another possibility.

> A general point is that if one gave no output that was not open to abuse,
> there'd be nothing given at all!  One would not be giving any output at all
> from poisson or binomial models, given that data that really calls for 
> quasi links (or a glmm with observation level random effects) is in my
> experience the rule rather than the exception!

Hmmm. Not sure I agree on that entirely, but that's a different discussion.

> At the very least, why not a function dispersion() or pearsonchisquare()
> that gives this information.

Lots of options here... Offhand, my preference would go to something like
anova(..., test="score") and/or an extra line in summary(). It's not a computationally intensive item as far as I can see; it's more about "output real estate" -- how "SAS-like" do we want to become?
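
To make the suggestion concrete, here is a minimal sketch of what a dispersion()/pearsonchisquare()-style helper could compute. The name pearson_dispersion and its return format are hypothetical, chosen only for illustration; nothing like it is claimed to exist in R. It uses only residuals() and df.residual().

```r
# Hypothetical helper (not part of R): Pearson chi-square statistic and the
# dispersion estimate derived from it, for a fitted glm.
pearson_dispersion <- function(model) {
    chisq <- sum(residuals(model, type = "pearson")^2)
    df <- df.residual(model)
    c(chisq = chisq, df = df, dispersion = chisq / df)
}

# Made-up counts, for illustration only
d <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12, 15), x = 1:9)
fit <- glm(y ~ x, family = poisson, data = d)
pd <- pearson_dispersion(fit)
```

For a quasi family fit to the same data, the "dispersion" component should agree with summary(model)$dispersion, which is estimated from the same Pearson statistic.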

> Apologies that I misattributed this.

Never mind...

Back to the original question: 

The current rstandard() code reads

## FIXME ! -- make sure we are following "the literature":
rstandard.glm <- function(model, infl = lm.influence(model, do.coef=FALSE), ...)
{
    res <- infl$wt.res # = "dev.res"  really
    ## scale by the dispersion and the leverage (hat values)
    res <- res / sqrt(summary(model)$dispersion * (1 - infl$hat))
    res[is.infinite(res)] <- NaN
    res
}

which "svn blame" attributes to ripley, but that is due to the 2003 code reorganization (except for the infinity check from 2005). So apparently, we have had that FIXME since forever... and finding its author appears to be awkward (Maechler, perhaps?).

I did try Brett's code in lieu of the above (with a mod to handle $dispersion) and even switched the default to use the Pearson residuals. "make check-devel" sailed straight through apart from the obvious code/doc mismatch, so we have neither checks in place nor any examples using rstandard(). I rather strongly suspect that there aren't many user codes using it either.
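
For concreteness, a change of this kind might look roughly like the following. This is only a sketch under stated assumptions, not Brett's code and not a committed version: the name rstandard_sketch and the type argument are hypothetical, and it relies only on residuals(), hatvalues() and summary().

```r
# Hypothetical helper (not the R implementation): standardized residuals for
# a glm, with a choice between deviance and Pearson residuals.
rstandard_sketch <- function(model, type = c("deviance", "pearson")) {
    type <- match.arg(type)
    res <- residuals(model, type = type)
    phi <- summary(model)$dispersion   # 1 for poisson/binomial families
    h <- hatvalues(model)              # leverages
    res <- res / sqrt(phi * (1 - h))
    res[is.infinite(res)] <- NaN
    res
}

# Made-up counts, for illustration only
d <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12, 15), x = 1:9)
fit <- glm(y ~ x, family = poisson, data = d)
r_dev  <- rstandard_sketch(fit, "deviance")
r_pear <- rstandard_sketch(fit, "pearson")
```

With type = "deviance" this follows the same scaling as the code quoted above; with type = "pearson" only the numerator changes.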

It is quite tempting simply to commit the change (after updating the docs). One thing holding me back though: I don't know what "the literature" refers to.

Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
