[R] Presenting Hazard ratios for interacting variables in a Cox model

David Winsemius dwinsemius at comcast.net
Fri Nov 18 20:21:04 CET 2016

> On Nov 18, 2016, at 6:56 AM, Stuart Patterson via R-help <r-help at r-project.org> wrote:
> I have a time-dependent cox model with three variables, each of which
> interacts with the other two. So my final model is:
> fit12<-coxph(formula = Surv(data$TimeIn, data$Timeout, data$Status) ~
>   data$Year + data$Life_Stg + data$prev.tb + data$prev.tb*data$Life_Stg +
>   data$Year*data$Life_Stg + data$Year*data$prev.tb +
>   frailty(data$Natal_Group), data = data)

It seems fairly likely that you are shooting yourself in the foot by using `data$variate` inside the formula. The terms in the fitted object are then named `data$Year` and so on rather than after the columns themselves, so they cannot be matched against the columns of a new data frame. That will become evident when you try to do any predictions. Try instead:

library(survival)
fit12 <- coxph(Surv(TimeIn, Timeout, Status) ~
                 prev.tb * Life_Stg + Year * Life_Stg + Year * prev.tb +
                 frailty(Natal_Group),
               data = data)
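
To illustrate why the bare column names matter, here is a minimal sketch on made-up data. It uses `lm` rather than `coxph` only to keep the example small; the names `d`, `new`, `fit_bad` and `fit_good` are hypothetical and not from the original post. With `d$x` in the formula, `predict` ignores the rows of `newdata` and falls back on the original data (with a warning, suppressed here).

```r
# Toy data (hypothetical names); lm() is used only to keep the example small
set.seed(42)
d <- data.frame(x = rnorm(50))
d$y <- 2 * d$x + rnorm(50)
new <- data.frame(x = c(-1, 0, 1))

fit_bad  <- lm(d$y ~ d$x)         # terms are literally `d$y` and `d$x`
fit_good <- lm(y ~ x, data = d)   # terms refer to columns by name

# The `d$x` term is looked up in the original data frame, not in `new`,
# so predict() warns and returns one value per original row.
p_bad  <- suppressWarnings(predict(fit_bad, newdata = new))
p_good <- predict(fit_good, newdata = new)

length(p_bad)   # 50
length(p_good)  # 3
```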

The `*` in a formula automatically includes the lower order individual variates in the estimates. Your model RHS could have also been written (more clearly in my opinion):

         ~ (prev.tb + Life_Stg +  Year)^2

... since R formulas interpret the `(.)^N` operation as "all base effects and interactions up to order N".
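
A quick way to check that the two spellings expand to the same terms is to compare their term labels. This is only a sketch: the helper `norm_labels` is hypothetical, and it just sorts the labels so that the order in which the interactions are written does not matter.

```r
f_star  <- ~ prev.tb * Life_Stg + Year * Life_Stg + Year * prev.tb
f_power <- ~ (prev.tb + Life_Stg + Year)^2

labs_star  <- attr(terms(f_star),  "term.labels")
labs_power <- attr(terms(f_power), "term.labels")

# Hypothetical helper: sort variables within each term, then sort the terms,
# so that "Year:Life_Stg" and "Life_Stg:Year" compare as equal.
norm_labels <- function(labs) {
  parts <- strsplit(labs, ":", fixed = TRUE)
  sort(vapply(parts, function(p) paste(sort(p), collapse = ":"), character(1)))
}

same_terms <- identical(norm_labels(labs_star), norm_labels(labs_power))
same_terms          # TRUE: three main effects plus three two-way interactions
length(labs_power)  # 6
```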

> For my variables, there are three categories of Year, three of Life_Stg, and
> prev.tb is a binary variable. Because of the interactions, when I present
> the results, I want to present the Hazard ratio, 95% CI, and p value for
> each combination of the three variables. How do I get R to give me these
> values please?
> I think that the contrast function does this for other models but does not
> work for coxph?

The usual method would be to use `predict` on a 'newdata' data frame containing all the combinations generated by `expand.grid`. The combination of the reference values of all three variables should yield a hazard ratio of 1.0. But predictions from a time-dependent model need a complete specification of a sequence of values over the time course of the study (as well as a specification of the frailty term). So I'm not in a position to comment on feasibility for this situation.
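
For the simpler case of an ordinary Cox model (no time-dependent covariates, no frailty), the `expand.grid` idea can be sketched as follows. Everything here is an assumption for illustration: the data are simulated, the names `d`, `grid` and `X` are made up, and instead of `predict` the sketch forms the linear combinations directly from `coef` and `vcov` so that each hazard ratio, with a Wald 95% interval and p-value, is relative to the all-reference combination. Whether this carries over to the time-dependent frailty model in the question is not something it settles.

```r
library(survival)

# Simulated stand-in data (hypothetical), not the poster's data
set.seed(1)
n <- 600
d <- data.frame(
  Year     = factor(sample(c("Y1", "Y2", "Y3"), n, replace = TRUE)),
  Life_Stg = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  prev.tb  = factor(sample(c("no", "yes"), n, replace = TRUE)),
  time     = rexp(n),
  status   = rbinom(n, 1, 0.7)
)

fit <- coxph(Surv(time, status) ~ (prev.tb + Life_Stg + Year)^2, data = d)

# All 2 x 3 x 3 combinations, with the same factor levels as the data
grid <- expand.grid(prev.tb  = levels(d$prev.tb),
                    Life_Stg = levels(d$Life_Stg),
                    Year     = levels(d$Year))

# Design matrix for the grid, columns matched to the coefficients by name
X <- model.matrix(~ (prev.tb + Life_Stg + Year)^2, data = grid)
X <- X[, names(coef(fit)), drop = FALSE]

lp <- drop(X %*% coef(fit))                    # log hazard ratio vs reference
se <- sqrt(rowSums((X %*% vcov(fit)) * X))     # its standard error

grid$HR    <- exp(lp)
grid$lower <- exp(lp - qnorm(0.975) * se)
grid$upper <- exp(lp + qnorm(0.975) * se)
grid$p     <- 2 * pnorm(-abs(lp / se))         # NaN for the reference row

grid
```

The first row is the all-reference combination, so its hazard ratio is 1 with a zero-width interval and no meaningful p-value.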

> Grateful for any suggestions
> Best wishes
> Stuart Patterson, Royal Veterinary College, University of London

David Winsemius
Alameda, CA, USA
