[R] CoxPH multivariate frailty model with coxph (survival)

Thu Aug 8 16:23:29 CEST 2019

 Chris,

Thanks a lot for the help and your investigation by simulations.

> Interacting a fixed effect with a strata variable is not uncommon.  No
> main effect of the strata variable is >  possible but the interaction term
> allows the fixed effect variable to have different values in different
>  strata.  I've more often seen it coded as a direct interaction:
> coxph( Surv(Time, Status) ~ treatment*strata(Event_type) + frailty(ID),
> data=example)
> (e.g., page 47 of Therneau and Grambsch)

This formulation works but doesn't provide the beta estimate of Tx on the
second event type (we need to sum the two beta). An equivalent formulation
that provides beta estimates on the two types of events is to create
intermediate variables as proposed page 177 of Therneau and Grambsch and
attached here.

<https://r.789695.n4.nabble.com/file/t388337/example.png>

> I don't see anything inherently wrong with your interpretation of the
> model.  While it seems that the
> assumption of no effect of one event on the other is very strong, I don't
> know the context of your
> analysis. I visualized it as a 4-state model.  Everyone starts in 0
> ("neither event").

To clarify the context of this analysis, we are interested to estimate
(&test) the effect of a genetic variant on two time-to-complication traits
(non-competitive based on expert's knowledge), while accounting for
potential unexplained dependency between the traits. Multi-state models can
effectively be an interesting
alternative model formulation that I need to think a bit more in this
context.

> I did not observe much gain from the frailty term (unmeasured covariate)
> with only 2 events in the short > simulation I tried (except when the
> effect was very strong and I got convergence warnings).

I modified your simulation script to plot the beta estimates from r=100 data
replicates obtained from the following model comparisons:
modf  : Cox PH frailty model (model of interest)
modc  : Cox PH including the simulated gamma variable (for comparison with
modf )
modcf : Cox PH frailty model  + gamma (for comparisons with modf & modc; the
introduction of the simulated gamma variable should reduced the variance of
the frailty term; )
modsep: separate Cox PH model for each time-to-event trait (no frailty, no
gamma)

It Overall, I got some expected results:

-modf reduces the bias of Tx effect on event 1 and event2 compared to modsep
(see boxplots)
-modsep show an increased bias, while not for modf when I increase the
variance of the simulated gamma term (see boxplots)
-when the gamma simulated covariate is introduced as a covariate in the
model (modc), the variance of the frailty term is reduced (modcf versus modf
results; see for the last replicates).

So, it validates some conclusions I had before, but :
1) modf returns an estimate of the variance of the frailty term that tends
to be lower than the one specified for the data simulation
2) To be sure the specified model fits exactly what I want, I need to check
the likelihood function especially because they use an equivalence with a
penalized likelihood for the gamma frailty (see help of frailty()).

Regarding the model I specified for the two time-to-event traits, and the
conditional assumption I assume (the two time-to-event traits are assumed
independent given the frailty term),
the contribution of each subject i to the likelihood of the model of
interest should have the form:
 Li = Li_ev1(beta1|ui) * Li_ev2(beta2|ui) * p(ui|zz),
with ui = subject's frailty term and ui~gamma(shape,scale) & p the gamma
distribution
Li_ev1 and Li_ev2 are the contribution of i to the likelihood for event1 &
event2

There is a description of the frailty model in Therneau's  et al paper:
 https://www.mayo.edu/research/documents/frailtypdf/doc-10027273 starting
page 15, detailed page 38.
The formulation is described in a different context, with N individuals
(1<=i<=N) in q independent clusters j (1<=j<=q) with a frailty term that
accounts for the dependence between individuals within each cluster.
In my case, each cluster corresponds to an individual i and dependencies
between the two observations for each subject are accounted by the frailty
term. Using their notations, j= individual and i= type of event.
I still need to clarify if their likelihood covers the likelihood of my
model formulation.

Any comments, suggestions or experiences are welcome. :)

If some have other suggestions of packages to fit this type of model,
please let me know (other potential R packages: parfm, frailtySurv, cph from
rms, mets, MST,..?).

Denise,

I attached below the extension of Chris's code.

#######################
# Modified Chris's code
#######################
library(flexsurv)
library(survival)

set.seed(20190729)

res <- NULL
for(r in 1:100) {
# Multiple non-competing outcomes, connected only by frailty (unmeasured
covariate)
nn <- 1000
kk <- 2

# frailty, 1 per individual
# variance of random effect
#tau <- 0.01^2
#tau <- 2^2
#tau <- 0.4^2
tau <- 0.8^2
#tau <- 1^2
#tau <- 0.6^2
gamma <- rgamma(nn, shape=1/tau, scale=tau) # mean is 1
hist(gamma)
sd(gamma) # ~ tau

# covariate
tx <- sample(0:1, nn, replace = TRUE)
# covariate effect on log hazard
beta <- 1
# might want to allow different treatment effects for different events
beta <- seq(kk)/kk

short <- data.frame(id=seq(nn), tx=tx, gamma=gamma)

# survival times, kk per individual
# tt <- rweibullPH(kk*nn, shape=2, scale=exp(beta*tx)*gamma)
# hist(tt)
# might want to allow different shapes or scales for different events
for (k in seq(kk)) {
        short[[paste0("time", k)]] <- rweibullPH(nn, shape=2,
scale=exp(beta[k]*tx)*gamma)
}
# might want to allow censoring

long <- reshape(short, direction="long", varying = paste0("time", seq(kk)),
v.names="times", timevar="event", idvar="id", times=seq(kk))
long <- long[order(long$id, long$event),]
mod <- coxph(Surv(times) ~ tx * strata(event) + frailty(id), data=long)
summary(mod)

mod0 <- coxph(Surv(times) ~ tx * strata(event), data=long)
coef(mod0)

mod1 <- coxph(Surv(times) ~ tx, data=long, subset=event==1)
coef(mod1)

mod2 <- coxph(Surv(times) ~ tx, data=long, subset=event==2)
coef(mod2)

# Transform long data & create contrasts terms
# create tx_ev1 & tx_ev2 to estimate effects of TX on event 1 & on event 2
long2 <- long
long2$event <- long$event-1
long2$tx_ev1 <- long2$tx*(1-long2$event)
long2$tx_ev2 <- long2$tx*(long2$event)

# mod1 corresponds to separate analysis of each time-to-event trait
modsep <- coxph(Surv(times)~tx_ev1 + tx_ev2 + strata(event), data=long2)
coef(modsep) # should match exactly Chris's results for mod1 & mod2
coef(mod1) # for verification
coef(mod2) # for verification

# Frailty model for time-to-event1 & time-to-event2 (frailty term captures
unexplained dependency)
modf <- coxph(Surv(times)~ tx_ev1 + tx_ev2  + strata(event) + frailty(id,
distribution="gamma"), data=long2)
coef(modf) # should match results of coef(mod)

#### for comparisons
# Corresponds to full model (includes simulated gamma as covariate, for
comparisons)
modc <- coxph(Surv(times)~tx_ev1 + tx_ev2 + gamma + strata(event) ,
data=long2) # corresponds to "true" model
coef(modc)

# Includes frailty term + gamma as a covariate (for comparisons)
modcf <- coxph(Surv(times)~tx_ev1 + tx_ev2 + gamma + strata(event) +
frailty(id, distribution="gamma") , data=long2) # corresponds to "true"
model
coef(modcf)

res <- rbind(res,list(coef(modc), coef(modf),coef(modcf),coef(modsep)))
}

pdf(paste("Boxplot.beta.event1.event2.tau",tau,".pdf",sep=""))
par(mfrow=c(2,4))
#boxplot for beta of Tx on event1
beta.tx.ev1.TM <- unlist(lapply(res[,1],function(x) {x[1] }))
boxplot(beta.tx.ev1.TM, main="Complete Model",ylab="beta of TX on event
1",col="grey")
abline(h=beta[1],col="red")
mean(beta.tx.ev1.TM)- beta[1] #bias

beta.tx.ev1.FM <-unlist(lapply(res[,2],function(x) {x[1] }))
boxplot(beta.tx.ev1.FM, main="Frailty Model",ylab="beta of TX on event
1",col="grey")
abline(h=beta[1],col="red")
mean(beta.tx.ev1.FM)- beta[1] #bias

beta.tx.ev1.FGM <-unlist(lapply(res[,3],function(x) {x[1] }))
boxplot(beta.tx.ev1.FGM, main="Frailty Model with sim. gamma
covariate",ylab="beta of TX on event 1",col="grey")
abline(h=beta[1],col="red")
mean(beta.tx.ev1.FGM)- beta[1] #bias

beta.tx.ev1.SM <- unlist(lapply(res[,4],function(x) {x[1] }))
boxplot(beta.tx.ev1.SM,main="Separate Models",ylab="beta of TX on event
1",col="grey")
abline(h=beta[1],col="red")
mean(beta.tx.ev1.SM)- beta[1] #bias

#beta of TX on event 2
beta.tx.ev2.TM <- unlist(lapply(res[,1],function(x) {x[2] }))
boxplot(beta.tx.ev2.TM,main="Complete Model",ylab="beta of TX on event
2",col="grey")
abline(h=beta[2],col="red")
mean(beta.tx.ev2.TM)- beta[2] #bias

beta.tx.ev2.FM <-unlist(lapply(res[,2],function(x) {x[2] }))
boxplot(beta.tx.ev2.FM,main="Frailty Model",ylab="beta of TX on event
2",col="grey")
abline(h=beta[2],col="red")
mean(beta.tx.ev2.FM)- beta[2] #bias

beta.tx.ev2.FGM <-unlist(lapply(res[,3],function(x) {x[2] }))
boxplot(beta.tx.ev2.FGM,main="Frailty Model with sim. gamma
covariate",ylab="beta of TX on event 2",col="grey")
abline(h=beta[2],col="red")
mean(beta.tx.ev2.FGM)- beta[2] #bias

beta.tx.ev2.SM <- unlist(lapply(res[,3],function(x) {x[2] }))
boxplot(beta.tx.ev2.SM,main="Separate Models",ylab="beta of TX on event
2",col="grey")
abline(h=beta[2],col="red")
mean(beta.tx.ev2.SM)- beta[2] #bias
dev.off()

# Print summary for the last data replicate
summary(modc)
summary(modf)
summary(modcf)
summary(modsep)

Le lun. 29 juil. 2019 à 12:00, Andrews, Chris <chrisaa using med.umich.edu> a
écrit :

> Denise,
>
> Interacting a fixed effect with a strata variable is not uncommon.  No
> main effect of the strata variable is possible but the interaction term
> allows the fixed effect variable to have different values in different
> strata.  I've more often seen it coded as a direct interaction:
>
> coxph( Surv(Time, Status) ~ treatment*strata(Event_type) + frailty(ID),
> data=example)
>
> (e.g., page 47 of Therneau and Grambsch)
>
> I don't see anything inherently wrong with your interpretation of the
> model.  While it seems that the assumption of no effect of one event on the
> other is very strong, I don't know the context of your analysis. I
> visualized it as a 4-state model.  Everyone starts in 0 ("neither event").
> There are hazards of moving from state 0 to state 1 (lambda01(t)) and from
> state 0 to state 2 (lambda02(t)).  The last state is "both events".  There
> are hazards of moving from state 1 to state 3 (lambda13(t), which you are
> assuming is identical to lambda02(t)) and from state 2 to state 3
> (lambda23(t), which you are assuming is identical to lambda01(t)).
>
> I did not observe much gain from the frailty term (unmeasured covariate)
> with only 2 events in the short simulation I tried (except when the effect
> was very strong and I got convergence warnings).
>
> Chris
>
>
> library("flexsurv")
>
> set.seed(20190729)
>
> # Multiple non-competing outcomes, connected only by frailty (unmeasured
> covariate)
>
> nn <- 1000
> kk <- 2
>
> # frailty, 1 per individual
> # variance of random effect
> #tau <- 0.01^2
> tau <- 0.1^2
> #tau <- 0.4^2
> #tau <- 1^2
> gamma <- rgamma(nn, shape=1/tau, scale=tau) # mean is 1
> hist(gamma)
> sd(gamma) # ~ tau
>
> # covariate
> tx <- sample(0:1, nn, replace = TRUE)
> # covariate effect on log hazard
> beta <- 1
> # might want to allow different treatment effects for different events
> beta <- seq(kk)/kk
>
> short <- data.frame(id=seq(nn), tx=tx, gamma=gamma)
>
> # survival times, kk per individual
> # tt <- rweibullPH(kk*nn, shape=2, scale=exp(beta*tx)*gamma)
> # hist(tt)
> # might want to allow different shapes or scales for different events
> for (k in seq(kk)) {
>         short[[paste0("time", k)]] <- rweibullPH(nn, shape=2,
> scale=exp(beta[k]*tx)*gamma)
> }
> # might want to allow censoring
>
> long <- reshape(short, direction="long", varying = paste0("time",
> seq(kk)), v.names="times", timevar="event", idvar="id", times=seq(kk))
> long <- long[order(long$id, long$event),]
>
> mod <- coxph(Surv(times) ~ tx * strata(event) + frailty(id), data=long)
> #summary(mod)
>
> mod0 <- coxph(Surv(times) ~ tx * strata(event), data=long)
> #summary(mod0)
>
> mod1 <- coxph(Surv(times) ~ tx, data=long, subset=event==1)
> #summary(mod1)
>
> mod2 <- coxph(Surv(times) ~ tx, data=long, subset=event==2)
> #summary(mod2)
>
> coef(mod)
> coef(mod0)
> coef(mod1)
> coef(mod2) - coef(mod1)
>
> coef(summary(mod))
> coef(summary(mod0))
> coef(summary(mod1))
>
>
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius using comcast.net]
> Sent: Sunday, July 28, 2019 2:03 AM
> To: r-help using r-project.org
> Subject: Re: [R] CoxPH multivariate frailty model with coxph (survival)
>
>
> On 7/19/19 10:19 AM, Denise b wrote:
> >   Dear R users,
> >
> > I am interested in estimating the effects of a treatment on two
> > time-to-event traits (on simulated data), accounting for the dependency
> > between the two time-to-event outcomes.
> >
> > I precise that the events are NOT recurrent, NOT competitive, NOT
> ordered.
> > The individuals are NOT related and can have 0, 1 or the 2 events (in any
> > ordered).
> >
> > So, I specified a time-to-event model with one Cox PH hazard function for
> > each outcome.
> > The two hazard functions are linked by a common subject-specific frailty
> > term (gamma distribution) to account for the dependency. The likelihood
> > function of that model would include two risks sets (1 for eachoutcome)
> > connected via the shared frailty term.
> >
> > To fit that model, I used the coxph function (survival R package, T.
> > Thernaud):
> > coxph( Surv(Time, Status) ~ treatment*Event_type + strata(Event_type) +
> > frailty(ID), data=example)
> If you have Event_type as a strata variable, it seems problematic that
> it be also interacting as a fixed effect.
> > - where example is my dataset, with 2*N individuals (2 rows for each
> > individual, corresponding to each time-to-event outcome)
> > - Time = c(Time_outcome1, Time_outcome2)
> How exactly do you expect the Surv-function to handle such a vector?
> > - Status=c(Event_outcome1, Event_outcome2)
>
> The help page for Surv() says:  "For multiple endpoint data the event
> variable will be a factor, whose first level is treated as censoring."
>
> So it, too, would not be "expecting" a two-item vector. Seems that the
> multi-state formulation would make more sense.
>
> > - Event_type=c(rep(0,length(Time_outcome1)),rep(1,length(Time_outcome2))
>
>
> I'm reasonably sure 0 would be interpreted as a censored observation.
>
> > - ID=c(ID,ID)
> >
> > So, the model implies different baseline hazard function for each outcome
> > (argument strata) and estimate a treatment effect for each event type(
> > treatment*Event_type).
>
>
> I doubt it. I'm pretty sure there would be a conflation of effects and
> stratification.
>
> >
> > Although the model returns some estimates close to what I expected
> > (simulation study), the problem is that most of the examples I found with
> > this type of formulation are for competitive time-to-event models (as
> > presented in Therneau's and Grambush's book "Modeling survival
> > data"  (pages 240-260 and 175-185) and also in some info I found on
>
>
> I cannot lay my hands on my copy of that text, but I'm fairly sure it
> never mixed strata(.) and fixed effects for a variable in the same model.
>
>
> I suppose this is where I should admit  that I'm trained as an
> epidemiologist and not as a statistician, but I've done a fair amount of
> work with these models. Seems to me that your code are sufficiently out
> of the mainstream practice that you would at least want to create a
> simulated data-set and see if this results in agreement with assumptions.
>
>
> --
>
> David.
>
> > internet).
> >
> > So, I am wondering if some of you have the same or different
> interpretation
> > of mime.
> > Otherwise do you have other functions to recommend me to fit the desired
> > model, such as coxme...? I also checked the package " mets" which also
> > describes several examples for competitive time-to-event traits, but not
> > for NON competitive events.
> >
> > Thanks in advance for your help,
> > Denise
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
> **********************************************************
> Electronic Mail is not secure, may not be read every day, and should not
> be used for urgent or sensitive issues
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]