[R] Nelson-Aalen estimator of cumulative hazard
rvaradhan at jhmi.edu
Mon May 4 06:32:54 CEST 2009
I am computing the Nelson-Aalen (NA) estimate of the baseline cumulative hazard in two different ways using the "survival" package. I expected the two results to be identical, but they are not: their difference increases monotonically with time. The difference is probably not large enough to have any impact in the application, but it is too large for me to simply ignore.
This is a competing risks problem, using the Green & Byar (1980) data set (the Stata data set is attached).
Can anyone explain to me the reason for the discrepancy?
library(survival) # Surv, coxph, basehaz, survfit
library(foreign) # read.dta for Stata files
gb <- read.dta("GB.dta") # Green & Byar data; N = 483
# Method 1
fit1 <- coxph( Surv(time, status=="Cancer" | status=="CVD" | status=="Other") ~ 1, data=gb)
h1 <- basehaz(fit1)
# Method 2
fit2 <- survfit(Surv(time, status=="Cancer" | status=="CVD" | status=="Other") ~ 1, data=gb)
jump <- fit2$n.event > 0
h2 <- cumsum(fit2$n.event[jump]/fit2$n.risk[jump])
plot(h1$time, h1$hazard - h2)
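One thing that may be worth checking is the handling of tied event times. This is a minimal sketch on a made-up data set (the data frame `toy` and its values are hypothetical, not the Green & Byar data). It rests on the assumption that the discrepancy is tie-related: coxph() defaults to the Efron approximation, whereas the hand-computed sum of d/n corresponds to the Breslow form. Recent versions of "survival" take the argument `ties=`; older versions call it `method=`.

```r
library(survival)

# Hypothetical toy data with tied event times (NOT the Green & Byar data)
toy <- data.frame(
  time  = c(1, 2, 2, 3, 3, 3, 4, 5, 5, 6),
  event = c(1, 1, 1, 0, 1, 1, 1, 1, 1, 0)
)

# Nelson-Aalen by hand: cumulative sum of d_j / n_j over all unique times
# (no subsetting on jumps, so the vector lines up with basehaz() output)
km <- survfit(Surv(time, event) ~ 1, data = toy)
na_hand <- cumsum(km$n.event / km$n.risk)

# Null Cox model under each tie-handling method
fit_breslow <- coxph(Surv(time, event) ~ 1, data = toy, ties = "breslow")
fit_efron   <- coxph(Surv(time, event) ~ 1, data = toy, ties = "efron")
h_breslow <- basehaz(fit_breslow)
h_efron   <- basehaz(fit_efron)

# Largest absolute gap between each baseline hazard and the hand computation
diff_breslow <- max(abs(h_breslow$hazard - na_hand))
diff_efron   <- max(abs(h_efron$hazard - na_hand))
print(c(breslow = diff_breslow, efron = diff_efron))
```

If the same pattern holds for the GB data, refitting fit1 with the Breslow option should make h1 agree with the hand-computed estimate, and the gap under Efron should grow at each tied event time.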
Ravi Varadhan, Ph.D.
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu