[R] Nelson-Aalen estimator of cumulative hazard

Ravi Varadhan rvaradhan at jhmi.edu
Mon May 4 06:32:54 CEST 2009


I am computing the Nelson-Aalen (NA) estimate of the baseline cumulative hazard in two different ways using the "survival" package. I expected the two estimates to be identical, but they are not: their difference increases monotonically with time. The difference is probably not large enough to have any impact in the application, but it is too non-trivial for me to simply ignore.

This is a competing-risks problem, using the Green & Byar (1980) data set (the Stata data file is attached).

Can anyone explain to me the reason for the discrepancy?


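library(foreign)   # provides read.dta()

library(survival)  # provides Surv(), coxph(), basehaz(), survfit()

gb <- read.dta("GB.dta")  # Green & Byar data; N = 483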

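# Method 1: baseline cumulative hazard from an intercept-only Cox model

fit1 <- coxph(Surv(time, status=="Cancer" | status=="CVD" | status=="Other") ~ 1, data=gb)

h1 <- basehaz(fit1)

# Method 2: Nelson-Aalen estimate computed directly from the survfit() output

fit2 <- survfit(Surv(time, status=="Cancer" | status=="CVD" | status=="Other") ~ 1, data=gb)

jump <- fit2$n.event > 0  # keep only times at which at least one event occurs

h2 <- cumsum(fit2$n.event[jump]/fit2$n.risk[jump])  # cumulative sum of d/n

# Difference between the two estimates over time

plot(h1$time, h1$hazard - h2)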
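To make the comparison concrete without the attached file, here is a minimal base-R sketch on a small made-up data set (the vectors `time` and `status` below are hypothetical, not the Green & Byar data). It computes the Nelson-Aalen increments d/n by hand, as in Method 2, next to a tie-corrected variant in which d tied events are treated as occurring one after another. Whether tie handling is what separates the two methods above is only an assumption here, not something established in this post; the sketch merely shows that such a correction would produce a gap of this shape.

```r
# Hypothetical toy data with tied event times (NOT the Green & Byar data)
time   <- c(1, 1, 2, 3, 3, 3, 4, 5)
status <- c(1, 1, 0, 1, 1, 0, 1, 0)   # 1 = event, 0 = censored

tt <- sort(unique(time[status == 1]))                        # distinct event times
d  <- sapply(tt, function(s) sum(time == s & status == 1))   # events at each time
n  <- sapply(tt, function(s) sum(time >= s))                 # number at risk at each time

# Nelson-Aalen: each event time contributes d/n (what Method 2 computes)
na.hat <- cumsum(d / n)

# Tie-corrected variant (assumption: illustrative only): the d tied events
# contribute 1/n + 1/(n-1) + ... + 1/(n-d+1) instead of d/n
tc.hat <- cumsum(mapply(function(di, ni) sum(1 / (ni - seq_len(di) + 1)), d, n))

print(cbind(time = tt, d = d, n = n, na.hat = na.hat,
            tc.hat = tc.hat, diff = tc.hat - na.hat))
```

In this toy example the two estimates agree at event times without ties and drift apart only where ties occur, so the gap never shrinks. On the real data, one check would be to refit Method 1 with `method = "breslow"` in `coxph()` and see whether the difference disappears.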

Thank you,

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu