[R] Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control

Therneau, Terry M., Ph.D. therneau at mayo.edu
Thu Mar 31 19:12:50 CEST 2016

Thanks to David for pointing this out.  The "time dependent covariates" vignette in the 
survival package has a section on time dependent coefficients that talks directly about 
this issue.  In short, the following model is simply wrong:
      coxph(Surv(time, status) ~ trt + prior + karno + I(karno * log(time)), data=veteran)
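A minimal sketch of why this goes wrong, using only the veteran data that ships with the survival package. The names t0, at_risk, naive and correct are illustrative and not part of any API. At an event time t0 the partial likelihood compares everyone still at risk, and the intended covariate for each of them is karno * log(t0). The formula above instead gives each subject karno * log(their own final time), which for anyone who survives past t0 is a value taken from the future:

```r
library(survival)

# Pick one observed event time from the middle of the follow-up (illustrative choice)
etimes <- sort(unique(veteran$time[veteran$status == 1]))
t0 <- etimes[length(etimes) %/% 2]

# Subjects still at risk at t0
at_risk <- veteran$time >= t0

# What I(karno * log(time)) supplies: each subject's OWN final time
naive <- veteran$karno[at_risk] * log(veteran$time[at_risk])

# What the time dependent covariate should be at t0: the current time
correct <- veteran$karno[at_risk] * log(t0)

# The two agree only for subjects whose follow-up ends exactly at t0
print(summary(naive - correct))
```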

People often try this as a way to create the time dependent covariate Karnofsky * log(t), 
which is frequently put forward as a way to deal with non-proportional hazards.  To do this 
correctly you have to use the tt() functionality in coxph to move the computation out of 
the model statement:
       coxph(Surv(time, status) ~ trt + prior + karno + tt(karno), data=veteran,
	    tt = function(x, time, ...) x*log(time))
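
One way to see what tt() does is to build the same covariate by hand, sketched here on the assumption that splitting follow-up at every event time is affordable for a data set this small. The names etimes, vet2, karno_logt, fit_split and fit_tt are illustrative. After the split, each row that is at risk at an event time ends at that event time, so karno * log(time) computed on the split rows is the covariate evaluated at the current time, not at the subject's final time:

```r
library(survival)

# Split each subject's follow-up at every observed event time
etimes <- sort(unique(veteran$time[veteran$status == 1]))
vet2 <- survSplit(Surv(time, status) ~ ., data = veteran, cut = etimes)

# On the split data, 'time' is the end of each (tstart, time] interval
vet2$karno_logt <- vet2$karno * log(vet2$time)

# Counting-process fit with the hand-built time dependent covariate
fit_split <- coxph(Surv(tstart, time, status) ~ trt + prior + karno + karno_logt,
                   data = vet2)

# The tt() fit from the message, on the original one-row-per-subject data
fit_tt <- coxph(Surv(time, status) ~ trt + prior + karno + tt(karno),
                data = veteran,
                tt = function(x, time, ...) x * log(time))

# Side-by-side coefficients from the two approaches
print(cbind(split = coef(fit_split), tt = coef(fit_tt)))
```

The split version makes the bookkeeping explicit at the cost of a much larger data frame; tt() does the equivalent expansion internally.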

BTW the following SAS code is also wrong:
      proc phreg data=veteran;
          model time * status(0) = trt prior karno*time;

SAS does the right thing, however, if you move the computation off the model line.
	  model time * status(0) = trt karno zzz;
           zzz = karno * time;

The quote "SAS does it but R fails" comes at me moderately often in this context.  The 
reason is that SAS won't LET you put a phrase like "log(time)" into the model statement, 
so people end up doing the right thing, but by accident.

Terry T.

On 03/30/2016 05:28 PM, Göran Broström wrote:
> On 2016-03-30 23:06, David Winsemius wrote:
>>> On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss
>>> <jennifer.wu2 at mail.mcgill.ca> wrote:
>>> Hi,
>>> I am currently using R v3.2.3 on 64-bit Windows 10.
>>> I am having convergence issues when I use coxph with an interaction
>>> term (glarg*bca_py) and an interaction term with the restricted cubic
>>> spline (glarg*bca_time_ns). I use the survival and splines packages to
>>> create the Cox model and cubic splines respectively. Without the
>>> interaction term and/or spline, I have no convergence problem. I
>>> read some forums about changing the number of iterations and I did
>>> so, but it did not work. I was just wondering if I am using iter.max
>>> and outer.max appropriately. I read the survival manual, other R-help
>>> and stackoverflow pages, and they suggested changing the iterations
>>> but did not specify the maximum I can set. I ran something
>>> similar in SAS and did not run into a convergence problem.
>>> This is my code:
>>> bca_time_ns <- ns(ins_ca$bca_py, knots=3,
>>>                   Boundary.knots=range(2,5,10))
>>> test <- ins_ca$glarg*ins_ca$bca_py
>>> test1 <- ins_ca$glarg*bca_time_ns
>> In your `coxph` call the variable 'bca_py' is the survival time and
>> yet here you are constructing not just one but two interactions (one
>> of which is a vector but the other one a matrix) between 'glarg' and
>> your survival times. Is this some sort of effort to identify a
>> violation of proportionality over the course of a study?
> Right David: I didn't notice that the 'missing main effect' in fact was part of the
> survival object! And as you say: Time to rethink the whole model.
> Göran
>> Broström sagely points out that these interactions are not in the
>> data-object and subsequent efforts to refer to them may be confounded
>> by the multiple environments from which data would be coming into the
>> model. Better to have everything come in from the data-object.
>> The fact that SAS did not have a problem with this rather
>> self-referential or circular model may be a poor reflection on SAS
>> rather than on the survival package. Unlike Therneau or Broström who
>> asked for data, I suggest the problem lies with the model
>> construction and you should be reading what Therneau has written
>> about identification of non-proportionality and identification of
>> time dependence of effects. See Chapter 6 of his "Modeling Survival
>> Data".
