[R] coxph diagnostics

Andrews, Chris chrisaa at med.umich.edu
Wed Aug 14 13:52:38 CEST 2013


"Based on the plot of Schoenfeld residuals and Terry's explanation, is it safe to say that the proportional hazards assumption holds despite the significant global p-values?"

No.  I don't want to put words in Terry's mouth, but he seems to be saying that proportional hazards does NOT hold but it may be close enough to be useful.  This is always a problem with goodness-of-fit tests and large datasets.
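
The coin example in Terry's message quoted below can be checked directly. This is only a sketch using base R's binom.test on the numbers from that example (5101 heads in 10,000 flips); it is an illustration of the large-sample point, not part of anyone's analysis in this thread.

```r
# Coin example from the quoted message: 5101 heads in 10,000 flips.
res <- binom.test(x = 5101, n = 10000, p = 0.5)

res$p.value    # below 0.05, so the coin is "not perfectly fair"
res$estimate   # estimated probability of heads, 5101/10000
res$conf.int   # 95% interval: excludes 0.5 but stays very close to it
```

The p-value answers "is the coin exactly fair?", while the estimate and interval answer "how far from fair is it?". With a large sample the two can point in different directions, which is the same situation as a significant cox.zph test alongside a nearly flat residual plot.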

Chris


-----Original Message-----
From: Soumitro Dey [mailto:soumitrodey1 at gmail.com] 
Sent: Tuesday, August 13, 2013 10:38 AM
To: Terry Therneau
Cc: r-help at r-project.org
Subject: Re: [R] coxph diagnostics

Thank you for your response, Terry.

To put the discussion into perspective, my data set is quite large with
over 160,000 samples and 38 variables. The event is true for all samples in
this dataset. The distribution is zero-inflated (i.e. most events occur at
time = 0).

The result of the cox.zph looks like this:

> cox.zph(coxph1)
                 rho    chisq        p
agency1    -1.05e-02 9.06e+00 2.62e-03
agency2    -5.48e-03 2.47e+00 1.16e-01
agency3    -6.47e-03 3.45e+00 6.34e-02
agency4    -6.86e-03 3.87e+00 4.90e-02
agency5    -5.56e-03 2.54e+00 1.11e-01
agency6    -6.79e-03 3.79e+00 5.16e-02
agency7    -4.78e-03 1.88e+00 1.71e-01
agency8    -1.34e-02 1.48e+01 1.22e-04
agency9    -2.78e-03 6.34e-01 4.26e-01
agency10   -6.15e-03 3.11e+00 7.78e-02
agency11    4.82e-04 1.91e-02 8.90e-01
agency12   -4.38e-03 1.58e+00 2.09e-01
agency13   -1.02e-03 8.54e-02 7.70e-01
agency14   -5.44e-03 2.43e+00 1.19e-01
agency15    1.01e-02 8.41e+00 3.73e-03
agency16   -1.81e-03 2.70e-01 6.04e-01
agency17   -3.14e-03 8.12e-01 3.67e-01
agency18   -6.59e-03 3.57e+00 5.88e-02
agency19    1.60e-03 2.12e-01 6.46e-01
agency20   -1.24e-02 1.27e+01 3.74e-04
agency21   -9.02e-03 6.69e+00 9.68e-03
agency22   -5.84e-03 2.81e+00 9.38e-02
agency23    3.99e-03 1.31e+00 2.52e-01
agency24   -9.18e-03 6.93e+00 8.50e-03
agency25   -4.75e-03 1.86e+00 1.73e-01
category1  -1.31e-02 1.43e+01 1.60e-04
category2   1.34e-04 1.47e-03 9.69e-01
category3   7.61e-03 4.75e+00 2.92e-02
category4  -6.65e-03 3.69e+00 5.48e-02
category5  -7.78e-03 4.97e+00 2.58e-02
category6  -8.64e-03 6.12e+00 1.34e-02
fav_count   1.32e-02 1.46e+01 1.32e-04
fow_count  -1.83e-02 2.50e+01 5.70e-07
fri_count   9.20e-03 6.89e+00 8.67e-03
stat_count  1.01e-02 9.08e+00 2.58e-03
ht          1.37e-02 1.53e+01 9.08e-05
ul          1.36e-02 1.52e+01 9.67e-05
um         -1.12e-02 1.04e+01 1.24e-03
pos        -5.92e-04 2.90e-02 8.65e-01
neg         6.44e-03 3.39e+00 6.56e-02
acti        2.24e-03 4.12e-01 5.21e-01
anat        3.48e-03 9.96e-01 3.18e-01
chemi      -7.82e-03 5.04e+00 2.47e-02
conc        7.04e-05 4.08e-04 9.84e-01
devi       -1.34e-03 1.48e-01 7.01e-01
diso       -3.60e-03 1.06e+00 3.04e-01
gene        1.31e-03 1.41e-01 7.07e-01
geog        4.64e-03 1.78e+00 1.82e-01
livb       -1.19e-02 1.17e+01 6.24e-04
objc        3.87e-03 1.23e+00 2.67e-01
occu        6.06e-04 3.04e-02 8.62e-01
orga       -8.24e-04 5.63e-02 8.12e-01
phen        3.87e-03 1.23e+00 2.68e-01
phys       -1.94e-03 3.12e-01 5.77e-01
proc        2.23e-03 4.11e-01 5.22e-01
GLOBAL             NA 4.20e+02 0.00e+00


The slope of the plot.cox.zph is perfectly 0 for all variables with narrow
confidence bands.
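
For readers who want to see how the test and the plot can be compared, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the sample size, the single binary covariate x, the Weibull shapes, and the seed are hypothetical and have nothing to do with the data set above. It needs the survival package, and recent versions of that package may print different columns from the rho/chisq/p layout shown above.

```r
library(survival)

set.seed(2013)                  # arbitrary seed, for reproducibility
n <- 20000                      # hypothetical sample size
x <- rbinom(n, 1, 0.5)          # hypothetical binary covariate

# Weibull event times whose shape differs slightly between the two groups,
# so the log hazard ratio is log(1.1) + 0.1 * log(t): it drifts with time
# and proportional hazards does not hold exactly.
time   <- rweibull(n, shape = ifelse(x == 1, 1.1, 1.0), scale = 1)
status <- rep(1, n)             # every subject has an event, as in the post

fit <- coxph(Surv(time, status) ~ x)
zp  <- cox.zph(fit)

print(zp)                       # test of zero slope in the scaled residuals
plot(zp)                        # smoothed estimate of beta(t) with bands
abline(h = coef(fit), lty = 2)  # time-averaged coefficient, for reference
```

The p-value from print(zp) says whether a departure from proportional hazards is detectable; the vertical range of the smoothed curve in plot(zp), compared with the dashed line, says how large that departure is on the log hazard ratio scale.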

I probably should have put these details in the first post, but it would have
been too long. Sorry about that.

Based on the plot of Schoenfeld residuals and Terry's explanation, is it
safe to say that the proportional hazards assumption holds despite the
significant global p-values?

Thanks!


On Tue, Aug 13, 2013 at 9:16 AM, Terry Therneau <therneau at mayo.edu> wrote:

> That's the primary reason for the plot: so that you can look and think.
>
> The test statistic is based on whether a LS line fit to the plot has zero
> slope.  For larger data sets you can sometimes have a "significant" p-value
> but good agreement with proportional hazards.  It's much like an example
> from Lincoln Moses' beginning statistics book (now out of print, so
> rephrasing from memory).
>    "Suppose that you flip a coin 10,000 times and get 5101 heads.  What
> can you say?
>        a. The coin is not perfectly fair (p<.05).  b. But it is darn close
> to perfect! "
> As a referee I would be comfortable using that coin to start a football
> game.
>
> The Cox model gives an average hazard ratio, averaged over time.  When
> proportional hazards holds that value is a complete summary-- nothing else
> is needed.    When it does not hold, the average may still be useful, or
> not, depending on the degree of change over time.
>
> Terry Therneau
>
>
>
> On 08/13/2013 05:00 AM, r-help-request at r-project.org wrote:
>
>> Thanks to Bert and Göran for your responses.
>>
>> To answer Göran's comment, yes I did plot the Schoenfeld residuals using
>> plot.cox.zph and the lines look horizontal (slope = 0) to me, which makes
>> me think that it contradicts the results of cox.zph.
>>
>> What alternatives do I have if I assume the proportional hazards
>> assumption of coxph does not hold?
>>
>> Thanks!
>>
>
