[R] Post-fit survival curve for shared frailty model

Fri May 31 15:01:18 CEST 2013

Hi,

I have fitted a shared frailty model using coxph in the survival
package with 'center' as random effect. Now I'm trying to obtain the
conditional survival curve for a new test case that comes from one of
the specified centers in the model.

The data set on which the model is fitted contains 1000 individuals
coming from 10 different centers, with 100 individuals in each center.

The fitted model has the following form:

model.frail = coxph(Surv(time,status) ~ x1 + frailty(center), data = d)

My new test case belongs to center no. 5 and has x1-value equal to 0.
My attempt to obtain the conditional survival function:

sfit = survfit(model.frail, newdata=data.frame(x1 = 0, center = 5))

This generates the error:
Error in survfit.coxph(model.frail, newdata = data.frame(x1 = 0,  :
  Newdata cannot be used when a model has sparse frailty terms

Am I specifying the newdata argument in the wrong way? If so, how
should it be specified?

I found an earlier note on the issue of frailty and survival curves by
Terry indicating that it is indeed possible to obtain the conditional
survival curve:

BEGINQUOTE

With respect to Cox models + frailty, and post-fit survival curves.

 1. There are two possible survival curves, the conditional curve
where we identify which center a subject comes from, and the marginal
curve where we have integrated out center and give survival for an
"unspecified" individual.  I find the first more useful.  More
importantly to your case, the survival package currently has no code
to calculate the second of these.

 2. When the number of centers is large the coxph code may have used a
sparse approximation to the variance matrix, for speed reasons.  In
this particular case one cannot use the "newdata" argument.  The
reason is entirely practical --- the code turned out to be very hard
to write. The need for this comes up very rarely, and the work around
is to use
   coxph(....... + frailty(center, sparse=1000, ....)
where we set the "sparse computation" threshold to be some number
larger than the number of centers, i.e., force non-sparse computation.

Terry Therneau

ENDQUOTE

Thanks in advance.

/Ulf