[R] Survival Rate Estimates

David Winsemius dwinsemius at comcast.net
Sat May 14 19:08:25 CEST 2011


On May 13, 2011, at 8:49 AM, Terry Therneau wrote:

> ---begin included message --
> Is there an automated way to use the survival package to generate
> survival
> rate estimates and their standard errors?  To be clear, *not *the
> survivorship estimates (which are cumulative), but the survival  
> *rate *
> estimates...
> --- end --
>
> The classic method in epidemiology is simple hazard rates = total
> events / total time.  This is done in more general form by the pyears
> (person-years) routine.
>
>  pfit <- pyears(Surv(time, status) ~ agegrp + sex, data=mydata)
>
> It's advantage over simple tables is the ability to do time-dependent
> categories.  For instance if I want death rates by decade of age:
>  acut <- tcut(mydata$age, 0:12 *10)
>  pfit2 <- pyears(time, status) ~ acut + sex, data=mydata)

I think there was an omitted Surv function call above.

> The first table was based on age at the start, the second on current
> age.

I didn't appreciate that comment in my first reading.  Thnaks to Brian  
for asking and Terry for authorship. That will make a big difference  
for my work. So now the first question is: How does 'acut' get handled  
internally? It looks like a factor, but I would have guessed that in  
order to deal with decade-crossing survival "non-events", that it  
would need to create some multiple records. It doesn't seem to do  
this, so perhaps something like that happen in the pyears function.

And a follow on question and a tentative answer: Can one generate a  
'pyears' table that incorporates categories of time observed by tcut()- 
ting the time variable and having it in both the Surv argument and on  
the RHS of the formula? Experimentation makes me think this works very  
well.

ddf <- data.frame(id = 1:100, surv.year=runif(100, 0,10),  
death=sample(c(0,1), 100, prob=c(0.7, 0.3), replace=TRUE) )

  pfit2$event/pfit2$pyears
#----
0+ thru  2 2+ thru  4 4+ thru  6 6+ thru  8 8+ thru 10
105.506642  80.474685  16.271176  20.165972   7.896898
>
>
> For non-parametric estimates of the hazard function look at the
> "survival" task view on CRAN; there are several choices.
>
> Terry Therneau
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT



More information about the R-help mailing list