# [R] weighted Cox proportional hazards regression

Scott Bartell sbartell at uci.edu
Tue Dec 4 14:00:54 CET 2007

```I'm getting unexpected results from the coxph function when using
weights from counter-matching.  For example, the following code
produces a parameter estimate of -1.59 where I expect 0.63:

d2 = structure(list(x = c(1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
1, 0, 0, 1, 0, 1, 0, 1, 0, 1), wt = c(5, 42, 40, 4, 43, 4, 42,
4, 44, 5, 38, 4, 39, 4, 4, 37, 40, 4, 44, 5, 45, 5, 44, 5), riskset = c(1L,
1L, 4L, 4L, 6L, 6L, 12L, 12L, 13L, 13L, 19L, 19L, 23L, 23L, 31L,
31L, 42L, 42L, 45L, 45L, 70L, 70L, 93L, 93L), cc = c(1, 0, 1,
0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0
), pseudotime = rep(1,24)), .Names = c("x", "wt", "riskset",
"cc", "pseudotime"), class = "data.frame", row.names=1:24)

coxph( Surv(pseudotime, cc) ~ x + strata(riskset), weights=wt,
robust=T, method="breslow",data=d2)

I'm expecting a value of about 0.63 to 0.64 based on the data source
(simulated) and the following hand-coded MLE:

negloglik = function(beta,dat) {
dat\$wexb = dat\$wt * exp(dat\$x * beta)
agged = aggregate(dat\$wexb,list(riskset=dat\$riskset),sum)
names(agged)[2] = "denom"
dat = merge(dat[dat\$cc==1,],agged,by="riskset")
-sum(log(dat\$wexb)-log(dat\$denom))
}
nlm(negloglik,0,hessian=T,dat=d2)

Am I misunderstanding the meaning of case weights in the coxph
function?  The help file is pretty terse.

Scott Bartell, PhD
Assistant Professor
Department of Epidemiology
University of California, Irvine

```

More information about the R-help mailing list