[R] using survreg() in survival package with "long" data

Fox, John jfox at mcmaster.ca
Mon Aug 31 17:09:36 CEST 2015


Dear Terry,

Thank you for the extended explanation -- it's helpful. 

Best,
 John

________________________________________
From: Therneau, Terry M., Ph.D. [therneau at mayo.edu]
Sent: August 31, 2015 9:56 AM
To: r-help at r-project.org; Fox, John; Göran Broström
Subject: Re: using survreg() in survival package with "long" data

On 08/30/2015 05:00 AM, r-help-request at r-project.org wrote:
> I'm unable to fit a parametric survival regression using survreg() in the survival package with data in "counting-process" ("long") form.
>
> To illustrate using a scaled-down problem with 10 subjects (with data placed on the web):
>

As usual I'm a day late since I read digests, and Göran has already clarified things.  A
discussion of this is badly needed in my as yet unwritten book on using the survival
package.  From a higher level view:
   If an observation is interval censored (a,b) then one knows that the event happened
between time "a" and time "b", but not when.  The survreg routine can handle interval
censored data since it is parametric (the likelihood contribution is the integral of the
density over the interval).  The interval (-infinity, b) is called 'left censored' and the
interval (a, infinity) is 'right censored'.  Left censored data is rare in medical work; an
example might be a chronic disease like rheumatoid arthritis where we know that the true
disease onset was some time before the date it was first detected, and one is trying to
deduce the duration of disease.
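
A minimal sketch of the interval censored case, assuming simulated data and hypothetical
variable names (left, right, x) that are not from the original thread.  It fits a Weibull
model with Surv(..., type = "interval2"), where an NA endpoint marks a left or right
censored observation, and then recomputes the log-likelihood by hand as the sum of
log(F(b) - F(a)), i.e. the integral of the density over each interval.

```r
library(survival)

set.seed(1)
n <- 200
x <- rbinom(n, 1, 0.5)
# Simulated "true" event times (illustrative only).
t <- rweibull(n, shape = 1.5, scale = exp(1 + 0.5 * x))

# Events are only known to lie between integer inspection times.
# NA on the left  = left censored  (-infinity, right)
# NA on the right = right censored (left, infinity); follow-up stops at time 6.
left  <- ifelse(t < 1, NA, pmin(floor(t), 6))
right <- ifelse(t > 6, NA, ceiling(t))
d <- data.frame(left, right, x)

fit <- survreg(Surv(left, right, type = "interval2") ~ x,
               data = d, dist = "weibull")

# Hand computation: survreg's Weibull has shape = 1/scale and
# scale parameter = exp(linear predictor).
shape <- 1 / fit$scale
sc    <- exp(predict(fit, type = "lp"))
a <- ifelse(is.na(d$left), 0, d$left)
b <- ifelse(is.na(d$right), Inf, d$right)
by_hand <- sum(log(pweibull(b, shape, sc) - pweibull(a, shape, sc)))

print(c(survreg = fit$loglik[2], by_hand = by_hand))
```

With no exactly observed event times in the data, the two numbers should agree up to
numerical tolerance.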

   Left truncation at time "a" means that any events before time "a" are not in the data
set.  In a referral center like mine this includes any subjects who die before they come
to us.  The coxph model handles left truncation naturally via its counting process
formulation.  That same formulation also allows it to deal with time-dependent
covariates.   Accelerated failure time models like survreg can handle left truncation in
principle, but they require that the values of any covariates are known from time 0 --
even for a truncated subject.   I have never added left truncation to the survreg code,
mostly because I have never needed it myself, but also because users would immediately
think that they could accomplish time-dependent covariates by simply using a long format
data set. Rather, each subject needs to be linked to a full covariate history, which is a
bit more work.
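
As an illustration of left truncation in coxph, here is a small sketch under assumed,
simulated data; the names entry, exit and x are hypothetical.  Subjects whose event
precedes their entry (referral) time never appear in the data set, and the
(start, stop, event) form of Surv() tells coxph that each remaining subject is at risk
only from entry onward.

```r
library(survival)

set.seed(2)
n <- 500
x     <- rbinom(n, 1, 0.5)
exit  <- rexp(n, rate = 0.2 * exp(0.7 * x))  # simulated event times
entry <- runif(n, 0, 3)                      # hypothetical referral times
d <- data.frame(entry, exit, x, status = 1)

# Left truncation: anyone whose event came before referral is never seen.
d <- d[d$exit > d$entry, ]

# Counting process form: at risk on (entry, exit], event at exit.
fit <- coxph(Surv(entry, exit, status) ~ x, data = d)
print(coef(fit))
```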

  So:  coxph does left truncation but not left (or interval) censoring;
       survreg does interval censoring but not left truncation (or time-dependent covariates).
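
A quick way to see this asymmetry, sketched with a tiny made-up data set (start, stop and
event are hypothetical names).  The assumption here is that survreg stops with an error
when handed a (start, stop, event) response, which is consistent with the problem
reported at the top of this thread; the exact message may vary by package version, so
the code only checks that an error occurred.

```r
library(survival)

# Made-up counting-process ("long") style data, for illustration only.
d <- data.frame(start = c(0, 2, 0, 1, 0, 3),
                stop  = c(2, 5, 4, 6, 3, 8),
                event = c(0, 1, 1, 1, 0, 1))

# survreg: capture the error instead of halting the script.
res <- tryCatch(survreg(Surv(start, stop, event) ~ 1, data = d),
                error = function(e) e)
print(inherits(res, "error"))

# coxph accepts the same response without complaint.
cox <- coxph(Surv(start, stop, event) ~ 1, data = d)
print(class(cox))
```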

Terry T
