[R] Discrepancy in the regression coefficients for Cox regression - PBC data set

Ravi Varadhan RVaradhan at jhmi.edu
Fri Nov 21 20:30:26 CET 2008


Peter,

I did check the data in the Appendix of F&H against the data in the
"survival" package.  I couldn't find any differences in the "time" and
"status" variables.
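
A check of this kind could be scripted roughly as follows. This is only a sketch under stated assumptions: `book` stands in for a hand-typed copy of the F&H appendix and `pkg` for the packaged data. Both names and all values are toy placeholders, not taken from either source.

```r
# Toy stand-ins (hypothetical values): 'pkg' for the packaged data,
# 'book' for a transcription of the book appendix.
pkg  <- data.frame(id = 1:5,
                   time   = c(10, 20, 30, 40, 50),
                   status = c(1, 0, 1, 1, 0))
book <- data.frame(id = 1:5,
                   time   = c(10, 20, 35, 40, 50),   # id 3 differs on purpose
                   status = c(1, 0, 1, 1, 0))

# Align the two sources on the subject id, then compare column by column.
both <- merge(pkg, book, by = "id", suffixes = c(".pkg", ".book"))
mismatch <- both$time.pkg != both$time.book |
            both$status.pkg != both$status.book

n_mismatch <- sum(mismatch)   # number of subjects that disagree
both[mismatch, ]              # rows to inspect by hand, if any
```

Matching on an id rather than on row order guards against the two sources being sorted differently.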

Maybe Terry Therneau knows the answer?

Ravi.
----------------------------------------------------------------------------
-------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvaradhan at jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



----------------------------------------------------------------------------
--------


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Peter Dalgaard
Sent: Friday, November 21, 2008 1:58 PM
To: Ravi Varadhan
Cc: r-help at r-project.org
Subject: Re: [R] Discrepancy in the regression coefficients for Cox
regression - PBC data set

Ravi Varadhan wrote:
> Hi David,
> 
> I did look at Appendix D.3 of T&G, but am not sure whether the data set 
> analyzed in F&H and the one shipped with "survival" are different.  They 
> both have n=418 (312 from RCT and 106 observational).  

Well, as David implies, if the observation times are longer and a few more
people died, that could easily explain the differences.
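
The effect described above can be illustrated on simulated data. The sketch below is not the PBC data: the covariate `x`, the cutoffs 5 and 15, and the helper `cut_data()` are all hypothetical. The same subjects are "frozen" at an early and at a later date and the Cox model is fitted to each version.

```r
library(survival)

set.seed(1)
n <- 400
x <- rbinom(n, 1, 0.3)                         # toy binary covariate
t_true <- rexp(n, rate = 0.1 * exp(0.8 * x))   # true log hazard ratio is 0.8

# Administrative censoring: subjects still event-free at the cutoff
# are censored at the cutoff.
cut_data <- function(cutoff) {
  data.frame(time   = pmin(t_true, cutoff),
             status = as.integer(t_true <= cutoff),
             x      = x)
}

early <- cut_data(5)    # data set assembled early
late  <- cut_data(15)   # same subjects, longer follow-up

fit_early <- coxph(Surv(time, status) ~ x, data = early)
fit_late  <- coxph(Surv(time, status) ~ x, data = late)

# More follow-up means more observed events and a somewhat different estimate.
c(events_early = sum(early$status), events_late = sum(late$status))
c(coef_early = unname(coef(fit_early)), coef_late = unname(coef(fit_late)))
```

Both fits estimate the same underlying coefficient, but they are computed from different sets of events, so the point estimates need not agree.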

Someone borrowed our copy of F&H so I can't check, but presumably you have
one (and it is your problem anyway...).

> 
> There is a major difference in the coefficient for "edema": 0.66 vs 
> 0.86.  In any case, the point is not whether the differences in the 
> coefficients affect interpretation of the model, but to understand why 
> the results differ.
> 
> Best,
> Ravi.
> 
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Friday, November 21, 2008 12:34 PM
> To: Ravi Varadhan
> Cc: r-help at r-project.org
> Subject: Re: [R] Discrepancy in the regression coefficients for Cox 
> regression - PBC data set
> 
> There is a discussion in Appendix D.3 of "Modeling Survival Data" by 
> Therneau and Grambsch regarding the differences in the datasets 
> including the fact that "there was significantly more follow-up for 
> many patients at the time this dataset was assembled". I do not see a 
> material difference in the estimates.
> 
> --
> David Winsemius, MD
> Heritage Labs
> 
> On Nov 21, 2008, at 12:16 PM, Ravi Varadhan wrote:
> 
>> Hi,
>>
>> When I run the following Cox proportional hazards model on the Mayo 
>> clinic's PBC data set (given in the "survival" package), the 
>> regression coefficients do not agree with the results presented in 
>> Table 4.6.3 (p. 195) of Fleming & Harrington's book.
>>
>> library(survival)
>>
>> data(pbc)
>>
>> ans.cox <- coxph(Surv(time, status) ~ log(bili) + log(alb) + age +
>> log(protime) + edema, data = pbc)
>>
>> ans.cox
>>
>>> ans.cox <- coxph(Surv(time, status) ~ log(bili) + log(alb) + age +
>> log(protime) + edema)
>>> ans.cox
>> Call:
>> coxph(formula = Surv(time, status) ~ log(bili) + log(alb) + age +
>>    log(protime) + edema)
>>
>>
>>                coef exp(coef) se(coef)     z       p
>> log(bili)     0.8975     2.453  0.08271 10.85 0.0e+00
>> log(alb)     -2.4524     0.086  0.65707 -3.73 1.9e-04
>> age           0.0382     1.039  0.00768  4.97 6.5e-07
>> log(protime)  2.3458    10.442  0.77425  3.03 2.4e-03
>> edema         0.6613     1.937  0.20595  3.21 1.3e-03
>>
>> Likelihood ratio test=234  on 5 df, p=0  n= 418
>>
>> These coefficients, however, differ noticeably (i.e. the differences 
>> cannot be attributed to round-off alone) from those reported in the 
>> "Final model" column of Table 4.6.3 of Fleming and Harrington (p. 195).
>> The coefficients reported there are: 0.8707, -2.533, 0.0394, 2.380, 
>> 0.8592.
>> Note the big difference for the "edema" variable.
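
To put the two sets of numbers side by side, the following sketch re-enters the values quoted in this post and expresses each difference in units of the R standard error. It assumes the book's coefficients are listed in the same variable order as the R output. The ratio is a descriptive scale only, not a formal test, since the two fits are not independent.

```r
# Values copied from the post; the variable order of the book's list is assumed.
r_coef  <- c(bili = 0.8975, alb = -2.4524, age = 0.0382,
             protime = 2.3458, edema = 0.6613)
r_se    <- c(bili = 0.08271, alb = 0.65707, age = 0.00768,
             protime = 0.77425, edema = 0.20595)
fh_coef <- c(bili = 0.8707, alb = -2.533, age = 0.0394,
             protime = 2.380, edema = 0.8592)

delta      <- fh_coef - r_coef     # book minus R
delta_in_se <- delta / r_se        # difference in units of the R standard error

round(cbind(R = r_coef, FH = fh_coef, diff = delta, diff_per_se = delta_in_se), 3)

# Hazard ratios implied by the two "edema" coefficients.
exp(c(R = unname(r_coef["edema"]), FH = unname(fh_coef["edema"])))
```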
>>
>> It seems that the data set considered in the book and the one available 
>> in the "survival" package are the same (with n=418).
>>
>> I also re-ran the Cox PH model after correcting the 2 "data-errors" 
>> discussed on p. 188 of F&H, but I still could not match the results in 
>> Table 4.6.3.
>>
>> Is it possible that the discrepancy is explained by differences 
>> in convergence during maximization of the partial likelihood?
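
One way to probe the convergence question is to refit with a tighter tolerance and more iterations and see whether the coefficients move. The sketch below uses simulated toy data, not the PBC data. The variables `x1`, `x2` and the tolerance values are illustrative assumptions; `coxph.control()` and its `eps` and `iter.max` arguments are part of the survival package.

```r
library(survival)

set.seed(2)
n  <- 300
x1 <- rnorm(n)
x2 <- rbinom(n, 1, 0.4)
t_event <- rexp(n, rate = 0.05 * exp(0.5 * x1 + 0.7 * x2))
t_cens  <- runif(n, 5, 40)                      # independent censoring times
toy <- data.frame(time   = pmin(t_event, t_cens),
                  status = as.integer(t_event <= t_cens),
                  x1 = x1, x2 = x2)

# Fit once with the default settings and once with a stricter
# convergence criterion and a larger iteration limit.
fit_default <- coxph(Surv(time, status) ~ x1 + x2, data = toy)
fit_tight   <- coxph(Surv(time, status) ~ x1 + x2, data = toy,
                     control = coxph.control(eps = 1e-12, iter.max = 50))

fit_default$iter                                   # iterations used by default
max_shift <- max(abs(coef(fit_tight) - coef(fit_default)))
max_shift                                          # largest change in any coefficient
```

If the largest change is far smaller than the discrepancy being investigated, convergence is unlikely to be the explanation.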
>>
>> Can anyone help me figure out why this discrepancy exists?
>>
>> Thanks very much,
>> Ravi.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
