[R] Modelling survival with time-dependent covariates

Ben Rhelp benrhelp at yahoo.co.uk
Thu Jul 1 21:28:50 CEST 2010


Hi all,

I am looking at the tutorial/appendix from John Fox on “Cox Proportional-Hazards Regression for Survival Data” available here:
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
I am particularly interested in modelling survival with time-dependent covariates (Section 4).
 
The data look like this:
>  Rossi.2[1:50,]
start stop arrest.time week arrest fin age race wexp mar paro prio educ employed
    0    1           0   20      1   0  27    1    0   0    1    3    3        0
    1    2           0   20      1   0  27    1    0   0    1    3    3        0
...
   18   19           0   20      1   0  27    1    0   0    1    3    3        0
   19   20           1   20      1   0  27    1    0   0    1    3    3        0
    0    1           0   17      1   0  18    1    0   0    1    8    4        0
    1    2           0   17      1   0  18    1    0   0    1    8    4        0
...
   15   16           0   17      1   0  18    1    0   0    1    8    4        0
   16   17           1   17      1   0  18    1    0   0    1    8    4        0
    0    1           0   25      1   0  19    0    1   0    1   13    3        0
    1    2           0   25      1   0  19    0    1   0    1   13    3        0
...
   12   13           0   25      1   0  19    0    1   0    1   13    3        0
 
John suggests the following model:
mod.allison.2 <- coxph(Surv(start, stop, arrest.time) ~
    fin + age + race + wexp + mar + paro + prio + employed,
    data=Rossi.2)
1- Would telling coxph which rows belong to the same person (through an Id variable, for example) improve the “efficiency” of the estimated model? And if so, how should I do that? Using strata()?
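
For illustration, one documented way to tell coxph that several rows share a subject is a cluster() term on an identifier, which requests robust (sandwich) standard errors. The sketch below rests on an assumption: Rossi.2 as shown has no Id column, so it uses the heart data set shipped with the survival package as a stand-in, since that data set is already in (start, stop] form and has an id variable. It does not settle the efficiency question; it only compares the coefficients of the two fits.

```r
library(survival)

# Stand-in data (assumption): `heart` ships with the survival package and,
# unlike Rossi.2 as printed above, already carries a subject identifier `id`.
fit.plain <- coxph(Surv(start, stop, event) ~ age + surgery + transplant,
                   data = heart)

# Same model, but rows are grouped by subject through cluster(id),
# which asks coxph for a robust variance estimate.
fit.robust <- coxph(Surv(start, stop, event) ~ age + surgery + transplant +
                      cluster(id), data = heart)

# Compare the point estimates of the two fits
same.coef <- isTRUE(all.equal(coef(fit.plain), coef(fit.robust)))
print(same.coef)

# The robust fit reports an extra "robust se" column in its summary
print(summary(fit.robust)$coefficients)
```

With Rossi.2 the analogous call would need an identifier column to be created first (one value per person, repeated over that person's rows); that column name would be hypothetical.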
 
2- He later suggests “accommodating non-proportional hazards by building interactions between covariates and time into the Cox regression model” as follows:
 
mod.allison.5 <- coxph(Surv(start, stop, arrest.time) ~
          fin + age + age:stop + prio,
          data=Rossi.2)
 
I have read quite a lot of documentation to understand the meaning of “age + age:stop” in the formula, but I am still unsure what it means. If I wanted to visualise the variables that enter the model, would it be something like:
data.frame(Rossi.2$age,Rossi.2$age %in% Rossi.2$stop)
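
As a point of comparison, a minimal sketch of how R expands these formula terms: model.matrix() builds the design columns for a formula, and for two numeric variables the term age:stop is their element-wise product. The toy data frame below is an assumption made for illustration; it reuses only the age and stop values of the first two rows of the first two subjects shown above, not the full Rossi.2 data.

```r
# Toy rows (assumption): age and stop for the first two intervals
# of the first two subjects printed above.
toy <- data.frame(age  = c(27, 27, 18, 18),
                  stop = c(1, 2, 1, 2))

# Design columns generated by the terms `age + age:stop`.
# model.matrix() also adds an intercept column, which coxph does not use.
X <- model.matrix(~ age + age:stop, data = toy)
print(X)

# The interaction column equals age multiplied by stop, row by row
is.product <- all(X[, "age:stop"] == toy$age * toy$stop)
print(is.product)
```

Running the same model.matrix() call with data = Rossi.2 would list the columns for every row of the real data.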
 
I hope this makes sense. Thanks for your help,
Ben





