[R] results of a survival analysis change when converting the data to counting process format

Ferenci Tamas
Sun Aug 18 19:10:12 CEST 2019


Dear All,

Consider the following simple example:

library( survival )
data( veteran )

coef( coxph(Surv(time, status) ~ trt + prior + karno, data = veteran) )
         trt        prior        karno 
 0.180197194 -0.005550919 -0.033771018

Note that we have neither time-dependent covariates, nor time-varying
coefficients, so the results should be the same if we change to
counting process format, no matter where we cut the times.

That's true if we cut at the observed (event or censoring) times:

veteran2 <- survSplit( Surv(time, status) ~ trt + prior + karno,
                       data = veteran, cut = unique( veteran$time ) )

coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran2 ) )
         trt        prior        karno 
 0.180197194 -0.005550919 -0.033771018 
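As a sanity check on the split itself, the sketch below verifies that the counting process version carries the same number of events and the same total follow-up time as the original data. It assumes the default column names produced by survSplit for this formula (tstart, time, status); the name veteran2 simply mirrors the object above.

```r
library(survival)

# Split at every observed time, as in the example above
veteran2 <- survSplit(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, cut = unique(veteran$time))

# The number of events must not change: each event should appear
# only on the last interval of its subject
sum(veteran$status)
sum(veteran2$status)

# Total follow-up time must not change either: the intervals
# (tstart, time] of one subject partition (0, original time]
sum(veteran$time)
sum(veteran2$time - veteran2$tstart)
```

If both pairs of numbers agree, the split has only rearranged the bookkeeping, so the risk sets at each event time are unchanged.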

But, quite interestingly, it is not true if we cut at every day:

veteran3 <- survSplit( Surv(time, status) ~ trt + prior + karno,
                       data = veteran, cut = 1:max(veteran$time) )

coef( coxph(Surv(tstart,time, status) ~ trt + prior + karno, data = veteran3 ) )
         trt        prior        karno 
 0.180197215 -0.005550913 -0.033771016 

The difference is small (about 2e-8 in the trt coefficient), but it is
definitely more than just rounding error.
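To put numbers on the discrepancy, and to see whether it depends on the stopping rule of the fitting iteration, one could try something like the sketch below. That the convergence tolerance plays a role is only a guess on my part; eps and iter.max are arguments of coxph.control, but the values 1e-12 and 50 are arbitrary choices, and the object names (f_orig, f_split, ctrl and so on) are just illustrative.

```r
library(survival)

# Fit on the original data
f_orig <- coxph(Surv(time, status) ~ trt + prior + karno, data = veteran)

# Fit on the data split at every day
veteran3 <- survSplit(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, cut = 1:max(veteran$time))
f_split <- coxph(Surv(tstart, time, status) ~ trt + prior + karno,
                 data = veteran3)

# Absolute and relative differences between the two coefficient vectors
abs(coef(f_split) - coef(f_orig))
abs(coef(f_split) - coef(f_orig)) / abs(coef(f_orig))

# Refit both models with a much stricter convergence criterion
# (arbitrary values, chosen only for this experiment)
ctrl <- coxph.control(eps = 1e-12, iter.max = 50)
f_orig_tight <- coxph(Surv(time, status) ~ trt + prior + karno,
                      data = veteran, control = ctrl)
f_split_tight <- coxph(Surv(tstart, time, status) ~ trt + prior + karno,
                       data = veteran3, control = ctrl)

# Largest remaining absolute difference under the stricter criterion
max(abs(coef(f_split_tight) - coef(f_orig_tight)))
```

If the last number is much smaller than the differences printed before it, the stopping rule would be a plausible explanation; if not, the cause must lie elsewhere.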

What's going on? How can the results go wrong, especially just by
including more cutpoints?

Thank you in advance,
Tamas


