[R] changes in coxph in "survival" from older version?

Shi, Tao shidaxia at yahoo.com
Mon May 16 19:56:42 CEST 2011


Hi Terry,

Really appreciate your help!  Sorry for my late reply.

I did realize that there are way more predictors in the model.  My initial 
thinking was use that as an initial model for stepwise model selection.  Now I 
wonder if the model selection result is still valid if the initial model didn't 
even converge?

Thanks!

...Tao




----- Original Message ----
> From: Terry Therneau <therneau at mayo.edu>
> To: "Shi, Tao" <shidaxia at yahoo.com>
> Cc: r-help at r-project.org
> Sent: Thu, May 12, 2011 6:42:09 AM
> Subject: Re: changes in coxph in "survival" from older version?
> 
> 
> On Wed, 2011-05-11 at 16:11 -0700, Shi, Tao wrote:
> > Hi all,
> > 
> > I found that the two different versions of "survival" packages, namely  
>2.36-5 
>
> > vs. 2.36-8 or later, give different results for coxph  function.  Please see 

> > below and the data is attached.  The  second one was done on Linux, but 
>Windows 
>
> > gave the same results.   Could you please let me know which one I should 
>trust?
> > 
> >  Thanks,
> 
>  In your case, neither.  Your data set has 22 events and 17  predictors;
> the rule of thumb for a reliable Cox model is 10-20 events per  predictor
> which implies no more than 2 for your data set.  As a result,  the
> coefficients of your model have very wide confidence intervals, the  coef
> for Male for instance has se of 3.26, meaning the CI goes from 1/26  to
> 26 times the estimate; i.e., there is no biological meaning to  the
> estimate.
> 
>   Nevertheless, why did coxph give a different  answer?  The later
> version 2.36-9 failed to converge (20 iterations)  with a final
> log-likelihood of -19.94, the earlier code converges in 10  iterations to
> -19.91.  In version 2.36-6 an extra check was put into the  maximizer for
> coxph in response to an exceptional data set which caused the  routine to
> fail due to overflow of the exp function; the Newton-Raphson  iteration
> algorithm had made a terrible guess in it's iteration path, which  can
> happen with all NR based search methods.  
>    I put a limit  on the size the linear predictor in the Cox model of
> 21.  The basic  argument is that exp(linear-predictor) = relative risk
> for a subject, and  that there is not much biological meaning for risks
> to be less than exp(-21)  ~ 1/(population of the earh).  There is more to
> the reasoning,  interested parties should look at the comments in
> src/coxsafe.c, a 5 line  routine with 25 lines of discussion.  I will
> happily accept input the  "best" value for the constant.
> 
>    I never expected to see a data set  with both convergence of the LL
> and linear predictors larger than +-15.   Looking at the fit (older code)
> > round(fit2$linear.predictor, 2)
>   [1]   2.26   0.89   4.96 -19.09 -12.10   1.39    2.82   3.10
>  [9]  18.57 -25.25  22.94   8.75   5.52  -27.64  14.88 -23.41
> [17]  13.70 -28.45  -1.84   10.04  12.62   2.54   6.33  -8.76
> [25]   9.68    4.39   2.92   3.51   6.02 -17.24   5.97
> 
> This says  that, if the model is to be believed, you have several near
> immortals in the  data set. (Everyone else on earth will perish first).
> 
> Terry  Therneau
> 
> 
> 
>



More information about the R-help mailing list