[R] Convergence issues when using ns splines (pkg: spline) in Cox model (coxph) even when changing coxph.control
Therneau, Terry M., Ph.D.
therneau at mayo.edu
Thu Mar 31 19:12:50 CEST 2016
Thanks to David for pointing this out. The "time dependent covariates" vignette in the
survival package has a section on time dependent coefficients that talks directly about
this issue. In short, the following model is simply wrong:
coxph(Surv(time, status) ~ trt + prior + karno + I(karno * log(time)), data=veteran)
People try this often as a way to create the time dependent covariate Karnofsky * log(t),
which is often put forwards as a way to deal with non-proportional hazards. To do this
correctly you have to use the tt() functionality in coxph to move the computation out of
the model statement:
coxph(Surv(time, status) ~ trt + prior + karno + tt(karno), data=veteran,
tt = function(x, time, ...) x*log(time))
BTW the following SAS code is also wrong:
proc phreg data=veteran;
model time * status(0) = trt + prior + karno* time;
SAS does the right thing, however, if you move the computation off the model line.
model time * status(0) = trt + karno + zzz;
zzz = karno * time;
The quote "SAS does it but R fails" comes at me moderately often in this context. The
reason is that SAS won't LET you put a phrase like "log(time)" into the model statement,
so people end up doing the right thing, but by accident.
Terry T.
On 03/30/2016 05:28 PM, Göran Broström wrote:
>
>
> On 2016-03-30 23:06, David Winsemius wrote:
>>
>>> On Mar 29, 2016, at 1:47 PM, Jennifer Wu, Miss
>>> <jennifer.wu2 at mail.mcgill.ca> wrote:
>>>
>>> Hi,
>>>
>>> I am currently using R v3.2.3 and on Windows 10 OS 64Bit.
>>>
>>> I am having convergence issues when I use coxph with a interaction
>>> term (glarg*bca_py) and interaction term with the restricted cubic
>>> spline (glarg*bca_time_ns). I use survival and spline package to
>>> create the Cox model and cubic splines respectively. Without the
>>> interaction term and/or spline, I have no convergence problem. I
>>> read some forums about changing the iterations and I have but it
>>> did not work. I was just wondering if I am using the inter.max and
>>> outer.max appropriately. I read the survival manual, other R-help
>>> and stackoverflow pages and it suggested changing the iterations
>>> but it doesn't specify what is the max I can go. I ran something
>>> similar in SAS and did not run into a convergence problem.
>>>
>>> This is my code:
>>>
>>> bca_time_ns <- ns(ins_ca$bca_py, knots=3,
>>> Boundary.knots=range(2,5,10)) test <- ins_ca$glarg*ins_ca$bca_py
>>> test1 <- ins_ca$glarg*bca_time_ns
>>
>> In your `coxph` call the variable 'bca_py' is the survival time and
>
> Right David: I didn't notice that the 'missing main effect' in fact was part of the
> survival object! And as you say: Time to rethink the whole model.
>
> Göran
>
>> yet here you are constructing not just one but two interactions (one
>> of which is a vector but the other one a matrix) between 'glarg' and
>> your survival times. Is this some sort of effort to identify a
>> violation of proportionality over the course of a study?
>>
>> Broström sagely points out that these interactions are not in the
>> data-object and subsequent efforts to refer to them may be confounded
>> by the multiple environments from which data would be coming into the
>> model. Better to have everything come in from the data-object.
>>
>> The fact that SAS did not have a problem with this rather
>> self-referential or circular model may be a poor reflection on SAS
>> rather than on the survival package. Unlike Therneau or Broström who
>> asked for data, I suggest the problem lies with the model
>> construction and you should be reading what Therneau has written
>> about identification of non-proportionality and identification of
>> time dependence of effects. See Chapter 6 of his "Modeling Survival
>> Data".
>>
>
More information about the R-help
mailing list