[R] Teasing out logrank differences between groups using survdiff or something else?

Wed Sep 16 05:43:44 CEST 2009

Thomas, thanks for your comments.  We weren't entirely sure we we even
framing the question right, your comments are encouraging.  Here are our
results:

Call:
coxph(formula = Surv(lifespan, status) ~ group, data = four)

  n= 573 

              coef exp(coef) se(coef)      z Pr(>|z|)
groupT1     4.1371   62.6224   0.2472 16.734  < 2e-16 ***
groupT1.U1  3.8921   49.0122   0.2367 16.442  < 2e-16 ***
groupU1    -0.5177    0.5959   0.1232 -4.201 2.65e-05 ***
---
Signif. codes:  0 Œ***¹ 0.001 Œ**¹ 0.01 Œ*¹ 0.05 Œ.¹ 0.1 Œ ¹ 1

           exp(coef) exp(-coef) lower .95 upper .95
groupT1      62.6224    0.01597    38.574  101.6637
groupT1.U1   49.0122    0.02040    30.819   77.9462
groupU1       0.5959    1.67824     0.468    0.7586

Rsquare= 0.697   (max possible= 1 )
Likelihood ratio test= 683.6  on 3 df,   p=0
Wald test            = 348.9  on 3 df,   p=0
Score (logrank) test = 646.6  on 3 df,   p=0

Which shows that there are huge differences in our treatments.  Here's the
survdiff output on the object created by the call above:

              N Observed Expected (O-E)^2/E (O-E)^2/V
group=WISO  145      145    213.2      21.8      39.2
group=T1    152      152     52.9     185.5     248.1
group=T1.U1 144      144     52.1     162.0     209.3
group=U1    132      132    254.8      59.2     130.7

 Chisq= 618  on 3 degrees of freedom, p= 0

To make sure I understand, the null hypothesis here is that these all have
the same survival and censoring functions, and we have shown here that they
do not.

But, we are particularly interesting in comparing the differential effect of
treatments (these are actually genes inserted into Drosophila that are
generally toxic to various degrees).  What's the proper way to show/prove
that:

T1.U1 compared to U1 is more hazardous than T1 vs WISO

If in fact it is true?  Maybe the answer is already in our output, in the
sense that the CI's don't overlap much?  Maybe we are wrong to seek a p
value as well?

Thanks again, Bryan

*************
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA

On 9/15/09 10:43 PM, "Thomas Lumley" <tlumley at u.washington.edu> wrote:

> 
> I think you do in fact want to just run the analysis for the four groups you
> are interested in. The logrank chisquared test would then be of the hypothesis
> that these four groups have the same survival and censoring distributions,
> with the greatest power for detecting proportional-hazards differences between
> the groups.
> 
> You are correct in noting that the results you get for comparing these four
> groups would change depending on what other groups are in the analysis. This
> is a seriously underappreciated property of rank-based analyses. However,
> because of this dependence I think you can make a good case that restricting
> the analysis to the groups of interest is the best way to run the test.
> 
>      -thomas
> 
> On Tue, 15 Sep 2009, Bryan Hanson wrote:
> 
>> R Folk:
>> 
>> Please forgive what I'm sure is a fairly naïve question; I hope it's clear.
>> A colleague and I have been doing a really simple one-off survival analysis,
>> but this is an area with which we are not very familiar, we just happen to
>> have gathered some data that needs this type of analysis.  We've done quite
>> a bit of reading, but answers escape us, even though the question below
>> seems simple.
>> 
>> Considering the following example from ?survdiff:
>> 
>>> survdiff(Surv(time, status) ~ pat.karno, data=lung)
>> Call:
>> survdiff(formula = Surv(time, status) ~ pat.karno, data = lung)
>> 
>> n=225, 3 observations deleted due to missingness.
>> 
>>               N Observed Expected (O-E)^2/E (O-E)^2/V
>> pat.karno=30   2        1    0.658    0.1774     0.179
>> pat.karno=40   2        1    1.337    0.0847     0.086
>> pat.karno=50   4        4    1.079    7.9088     8.013
>> pat.karno=60  30       27   15.237    9.0808    10.148
>> pat.karno=70  41       31   26.264    0.8540     1.027
>> pat.karno=80  51       39   40.881    0.0865     0.117
>> pat.karno=90  60       38   49.411    2.6354     3.853
>> pat.karno=100 35       21   27.133    1.3863     1.684
>> 
>> Chisq= 22.6  on 7 degrees of freedom, p= 0.00202
>> 
>> The p value here is for the entire group (right?).  How do we go about
>> determining the p value for the comparison of any four arbitrary groups in
>> all combinations, say pat.karno = 40, 60, 80, and 100?
>> 
>> We know (we think) that we can't just run the coxph analysis for the only
>> the groups of interest, as the hazard ratio for any one group in an analysis
>> with several groups is computed by holding the other groups at their average
>> value, so the hazard ratio varies by the context.
>> 
>> Seems like we need some sort of t-test or chi-squared test, but being mere
>> chemists and molecular biologists, we don't quite see it and wouldn't trust
>> ourselves anyway, given the special nature of survival analysis.  Manual
>> instructions or a function suggestion would be great.
>> 
>> Thanks in Advance, Bryan
>> *************
>> Bryan Hanson
>> Professor of Chemistry & Biochemistry
>> DePauw University, Greencastle IN USA
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> Thomas Lumley   Assoc. Professor, Biostatistics
> tlumley at u.washington.edu University of Washington, Seattle
> 
> 

[R] Teasing out logrank differences *between* groups using survdiff or something else?

[R] Teasing out logrank differences between groups using survdiff or something else?