[R] Weighted Kaplan-Meier estimates with R

Terry Therneau therneau at mayo.edu
Tue Mar 26 14:00:55 CET 2013


There are two ways to view weights.  One is to treat them as case weights: a weight 
of 3 means that there were actually three identical observations in the primary data, 
which were collapsed to a single observation in the data frame to save space.  This is the 
assumption of survfit.  (Most readers of this list will be too young to remember when 
computer memory was so small that we had to use tricks like this.)  The second is to 
treat them as sampling weights, in which case a Horvitz-Thompson-like estimator should be 
used for the variance.  This is the assumption of the svykm function in the survey package.  
Under the first view, your weights c(100,200) describe 300 subjects rather than 3, which 
is why the confidence intervals narrow.  It appears you want the second behavior.
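
To make the distinction concrete, here is a minimal sketch rather than a definitive recipe.  
The first part reuses the data from the question and compares case weights with the same 
rows written out in full.  The second part calls svykm on a small data frame `d` with a 
weight column `w`; both are made up purely for illustration, and the simple design 
(no clusters, no strata) is an assumption that your real sampling scheme may not satisfy.

```r
library(survival)

# Data from the question: two events, at times 50 and 100.
time   <- c(50, 100)
status <- c(1, 1)

# Case weights c(1, 2): survfit reads this as one subject at 50 and two at 100.
fit_w   <- survfit(Surv(time, status) ~ 1, weights = c(1, 2))
# The same data with the rows written out in full, no weights.
fit_rep <- survfit(Surv(rep(time, times = c(1, 2)),
                        rep(status, times = c(1, 2))) ~ 1)
# Weights scaled by 100: read as 100 subjects at 50 and 200 at 100.
fit_big <- survfit(Surv(time, status) ~ 1, weights = c(100, 200))

# Survival estimates from the three fits, side by side.
print(cbind(time = fit_w$time, weighted = fit_w$surv,
            expanded = fit_rep$surv, scaled = fit_big$surv))
# Standard errors at the first event time, for comparison.
print(c(weighted = fit_w$std.err[1], expanded = fit_rep$std.err[1],
        scaled = fit_big$std.err[1]))

# Sampling weights: svykm from the survey package, if it is installed.
# The data frame `d` and its weight column `w` are hypothetical toy values.
if (requireNamespace("survey", quietly = TRUE)) {
  d <- data.frame(time   = c(20, 35, 50, 60, 80, 100),
                  status = c(1, 0, 1, 1, 1, 0),
                  w      = c(1, 2, 1, 3, 2, 1))
  # Assumed design: independent sampling, one weight per row.
  des    <- survey::svydesign(ids = ~1, weights = ~w, data = d)
  km_svy <- survey::svykm(Surv(time, status) ~ 1, design = des, se = TRUE)
  print(km_svy)
}
```

With integer weights the curve is the same in all three survfit calls; only the 
standard error changes when the weights are rescaled, because survfit counts them as 
additional subjects.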

Terry Therneau


On 03/26/2013 06:00 AM, r-help-request at r-project.org wrote:
> As part of a research paper, I would like to draw both weighted and
> unweighted Kaplan-Meier estimates, the weight being the "importance" of
> each project to the mass of projects whose survival I'm trying to estimate.
>
> I know that the function survfit in the package survival accepts weights and
> produces confidence intervals. However, I suspect that the confidence
> intervals may not be correct. The reason why I suspect this is that
> depending on how I define the weights, I get very different confidence
> intervals, e.g.
>
> require(survival)
> s<- Surv(c(50,100),c(1,1))
> sf<- survfit(s~1,weights=c(1,2))
> plot(sf)
>
> vs.
>
> require(survival)
> s<- Surv(c(50,100),c(1,1))
> sf<- survfit(s~1,weights=c(100,200))
> plot(sf)
>
> Any suggestions would be more than welcome!
>


