[R] Apparently Conflicting Results with coxph

Kevin E. Thorpe kevin.thorpe at utoronto.ca
Mon Oct 1 15:27:02 CEST 2007


Dear List:

I have a data frame prepared in the couting process style for including
a binary time-dependent covariate.  The first few rows look like this.

    PtNo Start    End Status Imp
1      1     0  608.0      0   0
2      2     0  513.0      0   0
3      2   513  887.0      0   1
4      3     0   57.0      0   0
5      3    57  604.0      0   1
6      4     0  150.0      1   0


The outcome is mortality and the covariate is for an implantable
defibrillator, so it is expected that the implant would reduce the
risk of death.  The results of fitting coxph (survival package) are:

Call:
coxph(formula = Surv(Start, End, Status) ~ Imp, data = nina.excl)


     coef exp(coef) se(coef)     z    p
Imp 0.163      1.18    0.485 0.337 0.74

Likelihood ratio test=0.11  on 1 df, p=0.738  n= 335

Since this was unexpected, I created a non-counting process data
frame with an indicator variable representing received an implant
or not.  Here are the results:

Call:
coxph(formula = Surv(Days, Dead) ~ Implant, data = nina.excl0)


         coef exp(coef) se(coef)     z       p
Implant -1.77     0.171    0.426 -4.15 3.3e-05

Likelihood ratio test=19.1  on 1 df, p=1.21e-05  n= 197

I found this degree of discrepancy surprising, especially the point
estimate of the coefficient.  I have verified the data frames are
set up correctly.

Here is what I have tried to understand what is going on.

I tried fitting models adjusted for other covariates that I have in
the data frame.  This did not appreciably affect the coefficients
for the implant variable.

I ran cox.zph on the two models shown above and plotted the results.
In both cases, the point estimate of Beta(t) is sort of parabolic
in that the curves are monotonically increasing to a local maximum
after which they are monotonically decreasing (the CIs are a bit
more wiggly).

I would interpret this to mean that the effect of implant is probably
time-dependent.  If so, how do I actually get a "proper" estimate of
beta(t) for a variable like this?

Are there some other things I should look at to understand what's
going on?

Here is my sessionInfo.
R version 2.5.0 (2007-04-23)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "splines"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "methods"   "base"

other attached packages:
  cmprsk survival
 "2.1-7"   "2.31"


-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057



More information about the R-help mailing list