[Rd] Q-Q plot scaling in plot.lm(); bug or thinko?

Ben Bolker bbolker at gmail.com
Tue Oct 15 03:54:58 CEST 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


I've been looking fairly carefully at the Q-Q plots produced by
plot.lm() and am having difficulty understanding why plot.lm()
is doing what it's doing, specifically scaling the standardized
residuals by the prior weights. Can anyone explain this to me ... ?


Multiplying by the weights seems to give the wrong plot, at least for
binomial models; at the very least, it means that the $y$ (observed
quantiles) and $x$ (expected quantiles) scales are different from each
other even when the model is exactly correct ... although if the
weights are all the same, it doesn't change the linearity of the
Q-Q plot, but it does seem confusing if one is explicitly comparing
expected to observed ...

The help page says

> The ?S-L?, the Q-Q, and the Residual-Leverage plot, use 
> _standardized_ residuals which have identical variance (under the 
> hypothesis).  They are given as R[i] / (s * sqrt(1 - h.ii)) where 
> h.ii are the diagonal entries of the hat matrix

An example (in R markdown):

```{r}
set.seed(101)
n <- 1000  ## number of observations
N <- 100   ## binomial sample size
d <- data.frame(r=rbinom(n,size=N,prob=0.5),
                num=N)
## fit with proportion/weights syntax
g1 <- glm(r/N~1,family=binomial,weights=num,data=d)
```

The scale of the standardized residuals looks appropriate:
```{r}
range(rstandard(g1))
```

Showing the results of `plot.lm()`, `qqnorm()` on standardized
residuals, and th enhanced Q-Q plot from the `mgcv` package:
```{r fig.width=10,fig.height=5}
par(mfrow=c(1,3),las=1,bty="l")
plot(g1,which=2,main="plot.lm()")
qqnorm(rstandard(g1),main="qqnorm()")
mgcv::qq.gam(g1,pch=1,main="mgcv::qq.gam()")
```

`rstandard()` uses

```{r eval=FALSE}
res <- res/sqrt(summary(model)$dispersion * (1 - infl$hat))
```

`plot.lm()` uses

```{r eval=FALSE}
r <- residuals(x)
## ... some lines skipped ...
r.w <- if (is.null(w)) {
           r
       } else sqrt(w) * r
rs <- dropInf(r.w/(s * sqrt(1 - hii)), hii)
```
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSXKByAAoJEOCV5YRblxUH7GIIANdaisPNlYAM65BRJc+U62uQ
gdNfv7xnGhReVRvngkXWCiFWT/54WdJfODPe/gk9uHOfDaQjvyCkicDdmMwq/r5H
2IKGV/1vh9biJr4c89tr3I13Y7hpLcEU5LC+uCSbbsAFBxgmiyNRbZZolA6JI9Vx
ty4QBCSOLcFROmygr1MiLZ4S9DtK7J1s4ToT+iBCHuIT6C+qxc3YrF1NusV1kbpy
IBIRvK44JZJ+1pPrNSfP5v59pZWy9fCyokhX9Rsygx6CSD75YYcQwZ+SHJD7gZ2d
Ev2XPG1qzUeAjpZId/T08Gir7VcbMdeLNIRVIiAAgxTSwmJpWkapsBEgzrE+IDM=
=XOSY
-----END PGP SIGNATURE-----



More information about the R-devel mailing list