[R] Age as time-scale in a cox model
Terry Therneau
therneau at mayo.edu
Thu Feb 19 15:50:14 CET 2009
You asked about survival curves with age scale versus follow-up scale.
> fit1 <- coxph(Surv(time/365.25, status) ~ t5 + id + age, data=stanford2)
> surv1<- survfit(fit1)
> surv1
      n  events  median 0.95LCL 0.95UCL
157.000 102.000   1.999   0.898   3.608
> summary(surv1, times=3)
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    3     46      85    0.451  0.0425        0.375        0.543
I've taken the liberty of rewriting your query using the standard survival
library calls instead of Design, since I don't attempt to keep up with the
latter. The above shows a median survival of 1.999 years after enrollment, and
a 3 year survival of 45%. I was surprised when you put "id" in the model, but
it turns out to have p=.03! It seems that patients entered later in the study
have better survival.
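
For anyone who wants to look at the "id" effect directly, something along these lines should pull its row out of the coefficient table. This is only a sketch: it assumes the survival package and its stanford2 data set are installed, and that id roughly tracks order of entry into the study. The name id_row is just for illustration.

```r
library(survival)

# Same follow-up-scale model as above; rows with a missing t5 are
# dropped by the default na.action.
fit1 <- coxph(Surv(time/365.25, status) ~ t5 + id + age, data = stanford2)

# Row of the coefficient table for id: estimate, hazard ratio, Wald p-value.
id_row <- summary(fit1)$coefficients["id", c("coef", "exp(coef)", "Pr(>|z|)")]
print(id_row)
```

A negative coefficient (hazard ratio below 1) would correspond to lower risk for subjects with higher id values.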
Now for age scale:
> fit2 <- coxph(Surv(age, age+ time/365.25, status) ~ t5 + id, stanford2)
> surv2<- survfit(fit2)
> surv2
      n  events  median 0.95LCL 0.95UCL
    1.0   102.0    12.2    12.2    28.1
This shows a median age at death of 12.2 years. Puzzling, isn't it.
First, note that your code
cph(Surv(age,age+time, status) ~ t5+id, data=stanford2...
doesn't make sense due to different time scales: age in years and time in days.
As to your final question:
>These are obviously out-of sync, so there must be some way I can adjust them to
>mean the same thing. The first means the probability of surviving a 1000 days
>since they started being followed up while the second means the probability of
>surviving up to starting age+1000 days. How do I get the equivalent risks from
>the two models?
The first fit is on a "time since entry" scale, and so the survival curve is
with respect to time since entry. The second is on an age scale, and so the
curve will be in terms of absolute age, not "starting age + x". There is no
simple way to realign them.
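
A small worked example may make the difference between the two scales concrete. The subject here is hypothetical (entry at age 50, 1000 days of follow-up), not a row from stanford2, and the variable names are made up for illustration.

```r
# Hypothetical subject: enters at age 50 and is followed for 1000 days.
entry_age <- 50
followup_days <- 1000
followup_years <- followup_days / 365.25

# Follow-up scale: every subject's interval starts at 0.
followup_interval <- c(start = 0, stop = followup_years)

# Age scale: the interval starts at the subject's entry age.
age_interval <- c(start = entry_age, stop = entry_age + followup_years)

print(round(followup_interval, 2))
print(round(age_interval, 2))
```

On the first scale the time point of interest is about 2.74 years for every subject. On the second it is age 52.74 for this subject only; a subject who entered at 30 would reach "entry + 1000 days" at age 32.74, so no single point on the age-scale curve corresponds to it.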
As to the curve above with a median age of 12.2 years: we know that a usual
Kaplan-Meier curve can become unstable at the right hand end due to very small n
(<=5), which leads to big steps. With (start, stop) data this can happen at the
left end too. In the stanford2 data set there is one subject who enters the
study at age 12 and dies at age 12.2. At the time of death there is only 1
person at risk, so the survival curve goes to zero (100% death rate). This
curve is mathematically correct, but not at all useful.
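
The mechanism can be reproduced by hand with a few lines of base R. The data below are made up for illustration (only the first subject mimics the age 12 to 12.2 case); the sketch counts the number at risk at each death age and forms the usual product-limit estimate, which is an assumption about how one might check this rather than the internals of survfit.

```r
# Toy (entry age, exit age, status) data; status 1 = death, 0 = censored.
entry  <- c(12,   40, 41, 43, 45, 47)
exit   <- c(12.2, 44, 49, 46, 52, 50)
status <- c(1,     1,  0,  1,  1,  1)

# Distinct ages at which a death occurs.
death_ages <- sort(unique(exit[status == 1]))

# At risk at age t: already entered (entry < t) and not yet exited (exit >= t).
n_risk  <- sapply(death_ages, function(t) sum(entry < t & exit >= t))
n_event <- sapply(death_ages, function(t) sum(exit == t & status == 1))

# Product-limit estimate on the age scale.
surv <- cumprod(1 - n_event / n_risk)

print(data.frame(age = death_ages, n_risk, n_event, surv))
```

With only one subject at risk at age 12.2, the first factor is 1 - 1/1 = 0, and since the estimate is a running product it stays at zero for every later age, however many subjects are at risk there.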
Terry Therneau