[R] anova.coxph with subsets of data

Oscar Rueda Oscar.Rueda at cruk.cam.ac.uk
Wed Jan 29 10:15:58 CET 2014


Dear David, 
Thanks for your reply. 
I'll try to be more specific: why 

> library(survival)
> data(ovarian)
> fit <- coxph(Surv(futime, fustat) ~ resid.ds *rx + ecog.ps, data = ovarian, subset=ovarian$age>50)
> anova(fit)
> fit2 <- coxph(Surv(futime, fustat) ~ resid.ds +rx + ecog.ps, data=ovarian, subset=ovarian$age>50)
> anova(fit2,fit)

give different answers? (I assumed that both were sequential tests for the interaction.)

while 

> sub.ovarian <- subset(ovarian, age>50)
> fit <- coxph(Surv(futime, fustat) ~ resid.ds *rx + ecog.ps, data =sub.ovarian)
> anova(fit)
> fit2 <- coxph(Surv(futime, fustat) ~ resid.ds +rx + ecog.ps, data=sub.ovarian)
> anova(fit2,fit)

give the same answer?
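
One way to cross-check which p-value to trust is to compute the likelihood ratio test for the interaction by hand from the two fits. The following is only a sketch: it assumes the data are subsetted up front, so that both models are fitted to the same rows, and the names full, reduced, lr.stat, lr.df and p.value are illustrative, not from the thread.

```r
library(survival)

# Restrict the data first, so both fits see exactly the same rows.
# (ovarian is lazy-loaded with the survival package.)
sub.ovarian <- subset(ovarian, age > 50)

full    <- coxph(Surv(futime, fustat) ~ resid.ds * rx + ecog.ps, data = sub.ovarian)
reduced <- coxph(Surv(futime, fustat) ~ resid.ds + rx + ecog.ps, data = sub.ovarian)

# loglik[2] is the log partial likelihood at the fitted coefficients
# (loglik[1] is the value at the initial coefficients).
lr.stat <- 2 * (full$loglik[2] - reduced$loglik[2])

# The models differ only by the resid.ds:rx term.
lr.df <- length(coef(full)) - length(coef(reduced))

p.value <- pchisq(lr.stat, df = lr.df, lower.tail = FALSE)
print(c(Chisq = lr.stat, Df = lr.df, p = p.value))
```

The printed values can then be set against the interaction row of anova(fit) and against anova(fit2, fit) for each way of subsetting.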

Thanks for your help, 
Oscar

Oscar M. Rueda, PhD.
 Postdoctoral Research Fellow, Caldas Lab, Breast Cancer Functional
 Genomics.
 University of Cambridge. Cancer Research UK Cambridge Institute.
 Li Ka Shing Centre, Robinson Way.
 Cambridge CB2 0RE
 England
________________________________________
From: David Winsemius [dwinsemius at comcast.net]
Sent: Wednesday, January 29, 2014 12:10 AM
To: Oscar Rueda
Cc: r-help at r-project.org
Subject: Re: [R] anova.coxph with subsets of data

On Jan 28, 2014, at 10:32 AM, Oscar Rueda wrote:

> Dear list,
> I'm using the rms package to fit some Cox models. I run anova() on them to obtain sequential p-values, but I'm getting strange results when I run it on a subset of the data.
>
> Following the example on the help page of anova.coxph:
>> library(rms)
>> data(ovarian)
>> fit <- coxph(Surv(futime, fustat) ~ resid.ds *rx + ecog.ps, data = ovarian)
>> anova(fit)
>> fit2 <- coxph(Surv(futime, fustat) ~ resid.ds +rx + ecog.ps, data=ovarian)
>> anova(fit2,fit)
>
> would give me the same result, as expected.
> But if I do
>
>> fit <- coxph(Surv(futime, fustat) ~ resid.ds *rx + ecog.ps, data = ovarian, subset=ovarian$age>50)
>> anova(fit)
>> fit2 <- coxph(Surv(futime, fustat) ~ resid.ds +rx + ecog.ps, data=ovarian, subset=ovarian$age>50)
>> anova(fit2,fit)
>
> The first p-value seems to be wrong.

Wrong ... in what way?

> Would anybody please explain to me why?

Perhaps because anova is a generic function and you were expecting anova.cph to be used, but the coxph function is not from pkg:rms but rather from pkg:survival.

methods(anova)  # with both rms and survival loaded

>  methods(anova)
 [1] anova.coxmelist*   anova.coxph*       anova.coxphlist*   anova.glm
 [5] anova.glmlist      anova.glmmPQL*     anova.lm           anova.loess*
 [9] anova.loglm*       anova.mlm          anova.negbin*      anova.nls*
[13] anova.polr*        anova.rms*         anova.rq           anova.rqlist
[17] anova.survreg*     anova.survreglist*
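
To see which of these methods a given fit would reach, one can inspect the class of the fitted object, since S3 dispatch follows the class vector. This is a minimal sketch assuming only pkg:survival is attached; has.method is an illustrative name.

```r
library(survival)

fit <- coxph(Surv(futime, fustat) ~ resid.ds * rx + ecog.ps, data = ovarian)

# anova() dispatches on the class vector of its first argument.
print(class(fit))

# Check that a registered anova method exists for that class;
# optional = TRUE returns NULL instead of raising an error when none is found.
has.method <- !is.null(getS3method("anova", class(fit)[1], optional = TRUE))
print(has.method)
```

A model fitted with cph() from pkg:rms carries a different class vector, so running the same check on such an object shows whether a different method would be picked up.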

>
> Cheers,
> Oscar
>
> PS. I'm using R 3.0.1.
>

David Winsemius
Alameda, CA, USA




