[R] Discrepant lm() and survreg() standard errors with weighted fits

Kyle Penner kpenner at as.arizona.edu
Tue Feb 25 21:29:15 CET 2014


> Survreg treats weights as case weights, and lm treats them as sampling weights.
> Here is a simple example.  Data set test2 has two copies of every obs in data set test.
>
> > test <- data.frame(x=1:6, y=c(1,3,2,4,6,5))
> > test2 <- test[c(1:6, 1:6),]
>
> > summary(lm( y ~ x, data=test))$coef
>               Estimate Std. Error   t value   Pr(>|t|)
> (Intercept) 0.4000000  0.9039595 0.4424977 0.68100354
> x           0.8857143  0.2321154 3.8158362 0.01884548
>
> > summary(lm( y~x, data=test2))$coef
>               Estimate Std. Error   t value    Pr(>|t|)
> (Intercept) 0.4000000  0.5717142 0.6996503 0.500096805
> x           0.8857143  0.1468027 6.0333668 0.000126369
>
> As expected, the standard error has decreased by a factor of sqrt(2)
> Now fit the model using case weights:
>
> > summary(lm( y~x, data=test, weight= rep(2,6)))$coef
>               Estimate Std. Error   t value   Pr(>|t|)
> (Intercept) 0.4000000  0.9039595 0.4424977 0.68100354
>
>  Notice that the answer matches the first run with data set test.  Repeat this experiment
> with survreg, and you will find that the weighted run matches data test2.  When using the
> robust variance, survreg treats weights as sampling weights, not case weights.

When I use robust=F, I now understand the results:

> summary(survreg(Surv(y)~x, dist='gaussian', data=test, weights=rep(2,6)))$table
                 Value Std. Error          z            p
(Intercept)  0.4000000  0.5219013  0.7664285 4.434214e-01
x            0.8857143  0.1340119  6.6092222 3.863445e-11
Log(scale)  -0.2321528  0.2041241 -1.1373118 2.554080e-01

When I use robust=T, I do not understand how survreg treats the
weights as sampling weights and arrives at a different standard error
from lm:

> summary(survreg(Surv(y)~x, dist='gaussian', data=test, weights=rep(2,6), robust=T))$table
                 Value   Std. Err (Naive SE)        z            p
(Intercept)  0.4000000 0.29426260  0.5219013  1.35933 1.740420e-01
x            0.8857143 0.08384353  0.1340119 10.56390 4.380958e-26
Log(scale)  -0.2321528 0.08117684  0.2041241 -2.85984 4.238543e-03




More information about the R-help mailing list