[R] predict.coxph and predict.survreg

David Winsemius dwinsemius at comcast.net
Thu Nov 11 19:33:11 CET 2010


On Nov 11, 2010, at 12:14 PM, Michael Haenlein wrote:

> Thanks for the comment, James!
>
> The problem is that my initial sample (Dataset 1) is truncated. That means I
> only observe "time to death" for those individuals who actually died before
> the end of my observation period. It is my understanding that this type of
> truncation creates a bias when I use a "normal" regression analysis. Hence
> my idea to use some form of survival model.
>
> I had another look at predict.survreg and I think the option "response"
> could work for me.
> When I run the following code I get ptime = 290.3648.
> I assume this means that an individual with ph.ecog=2 can be expected to
> live another 290.3648 days before death occurs (days is the time scale of
> the time variable).

It is a prediction under specific assumptions underpinning a  
parametric estimate.
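
To make that concrete, here is a minimal sketch, assuming the default
Weibull fit from your code quoted below. As far as I understand
predict.survreg, type="response" returns exp(linear predictor), which for a
Weibull is the scale parameter (roughly the 63rd percentile) rather than the
mean or the median. The mean and median lines use the standard Weibull
formulas in the survreg parametrization; the object names (resp, sc,
wb_mean, wb_median) are mine, not part of the package.

```r
library(survival)

# Same model as in the quoted code; dist = "weibull" is the survreg default
lfit <- survreg(Surv(time, status) ~ ph.ecog, data = lung)
new  <- data.frame(ph.ecog = 2)

# type = "response" gives exp(linear predictor) on the time scale
resp <- unname(predict(lfit, newdata = new, type = "response"))
sc   <- lfit$scale   # survreg scale = 1 / Weibull shape

wb_mean   <- resp * gamma(1 + sc)   # mean of the fitted Weibull
wb_median <- resp * log(2)^sc       # median of the fitted Weibull

# Cross-check the median against predict()'s own quantile option
wb_median2 <- unname(predict(lfit, newdata = new, type = "quantile", p = 0.5))

print(c(response = resp, mean = wb_mean,
        median = wb_median, median_check = wb_median2))
```

If the two medians agree, the 290-day figure is best read as a Weibull scale
parameter, and the mean or median may be closer to what you are after.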

> Could someone confirm whether this makes sense?

You ought to confirm that it "makes sense" by comparing to your data:
require(Hmisc); require(survival)
<your code>

 > describe(lung[lung$status==1&lung$ph.ecog==2,"time"])
lung[lung$status == 1 & lung$ph.ecog == 2, "time"]
       n missing  unique    Mean
       6       0       6   293.7

           92 105 211 292 511 551
Frequency  1   1   1   1   1   1
%         17  17  17  17  17  17

 > ?lung

So status==1 marks a censored case and the observed deaths are status==2:
 > describe(lung[lung$status==2&lung$ph.ecog==2,"time"])
lung[lung$status == 2 & lung$ph.ecog == 2, "time"]
       n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
      44       1      44   226.0   14.95   36.90   94.50  178.50  295.75  500.00  635.85

lowest :  11  12  13  26  30, highest: 524 533 654 707 814

So the mean time to death (in a group that had only 6 censored
individuals, at times from 92 to 551) was 226, and the median time to
death among the 44 individuals who died was 178.5, with a right-skewed
distribution. You need to decide whether you want to make that particular
prediction when you know that you forced a specific distributional form
(Weibull, the survreg default) on the regression machinery by accepting
the default.
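
One way to see how much the distributional choice matters is to refit with
a few of the other built-in survreg distributions and compare the predictions
with a nonparametric (Kaplan-Meier) median for the ph.ecog==2 group. This is
only a sketch: the three distributions are an arbitrary selection, and the
names dists, fits, pred and km_median are made up for illustration.

```r
library(survival)

# Candidate distributions; "weibull" is what survreg uses by default
dists <- c("weibull", "lognormal", "loglogistic")
fits  <- lapply(dists, function(d)
  survreg(Surv(time, status) ~ ph.ecog, data = lung, dist = d))
names(fits) <- dists

new <- data.frame(ph.ecog = 2)

# For each fit: type = "response" and the predicted median (p = 0.5)
pred <- sapply(fits, function(f) c(
  response = unname(predict(f, newdata = new, type = "response")),
  median   = unname(predict(f, newdata = new, type = "quantile", p = 0.5))
))
print(round(pred, 1))

# Kaplan-Meier median for the same group, which accounts for censoring
km <- survfit(Surv(time, status) ~ 1, data = subset(lung, ph.ecog == 2))
km_median <- unname(quantile(km, probs = 0.5, conf.int = FALSE))
print(km_median)
```

If the predicted medians move a lot between distributions, or sit far from
the Kaplan-Meier figure, that is a sign the default deserves a second look.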


>
> lfit <- survreg(Surv(time, status) ~ ph.ecog, data=lung)
> ptime <- predict(lfit, newdata=data.frame(ph.ecog=2), type='response')
>
>
>
> On Thu, Nov 11, 2010 at 5:26 PM, James C. Whanger
> <james.whanger at gmail.com> wrote:
>
>> Michael,
>>
>> You are looking to compute an estimated time to death -- rather than the
>> odds of death conditional upon time.  Thus, you will want to use "time to
>> death" as your dependent variable rather than a dichotomous outcome
>> (0=alive, 1=death).  You can accomplish this with a straightforward
>> regression analysis.
>>
>> Best,
>>
>> Jim
>>
>> On Thu, Nov 11, 2010 at 3:44 AM, Michael Haenlein
>> <haenlein at escpeurope.eu> wrote:
>>
>>> Dear all,
>>>
>>> I'm struggling with predicting "expected time until death" for a coxph
>>> and survreg model.
>>>
>>> I have two datasets. Dataset 1 includes a certain number of people for
>>> which I know a vector of covariates (age, gender, etc.) and their event
>>> times (i.e., I know whether they have died and when, if death occurred
>>> prior to the end of the observation period). Dataset 2 includes another
>>> set of people for which I only have the covariate vector. I would like to
>>> use Dataset 1 to calibrate either a coxph or survreg model and then use
>>> this model to determine an "expected time until death" for the
>>> individuals in Dataset 2. For example, I would like to know when a person
>>> in Dataset 2 will die, given his/her age and gender.
>>>
>>> I checked predict.coxph and predict.survreg as well as the document "A
>>> Package for Survival Analysis in S" written by Terry M. Therneau but I
>>> have to admit that I'm a bit lost here.
>>>
>>> Could anyone give me some advice on how this could be done?
>>>
>>> Thanks very much in advance,
>>>
>>> Michael
>>>
>>>
>>>
>>> Michael Haenlein
>>> Professor of Marketing
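
On the coxph half of the original question: a Cox model has no parametric
time scale, so one hedged option is to take the predicted survival curve for
the new covariate values and read off its median, or the area under it up to
the last observed time (a restricted mean). The sketch below assumes the same
lung example with ph.ecog as the only covariate; cfit, sf, cox_median and
rmean are illustrative names.

```r
library(survival)

cfit <- coxph(Surv(time, status) ~ ph.ecog, data = lung)

# Predicted survival curve for a new subject with ph.ecog = 2
sf <- survfit(cfit, newdata = data.frame(ph.ecog = 2))

# Median: first time at which the predicted curve reaches 0.5
cox_median <- unname(quantile(sf, probs = 0.5, conf.int = FALSE))

# Restricted mean: area under the step function up to the last time point
tt    <- sf$time
ss    <- as.numeric(sf$surv)
rmean <- sum(diff(c(0, tt)) * c(1, head(ss, -1)))

print(c(median = cox_median, restricted_mean = rmean))
```

The restricted mean depends on where follow-up ends, so it is a lower bound
on "expected time until death" rather than the thing itself.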


David Winsemius, MD
West Hartford, CT


