[R] coxme in R underestimates variance of random effect, when random effect is on observation level

Wed Mar 28 19:27:16 CEST 2018

Hello,

I have a question concerning fitting a cox model with a random intercept, also known as a frailty model. I am using both the coxme package, and the frailty statement in coxph. Often 'shared' frailty models are implemented in practice, to group people who are from a cluster to account for homogeneity in outcomes for people from the same cluster. I am more interested in the classic frailty model, 'Early frailty models incorporated subject-specific random effects to account for unmeasured subject characteristics that influenced the hazard of the occurrence of the outcome'. This is because I have data where I would like to estimate the amount of variation between patients risks that has not been accounted for by adjusting for the variables that I have.

I initially ran some simulations to check that I could estimate the variance of such random effects consistently with no bias. When the random effect is applied on the group level (shared frailty model), the variance of the random effect can be calculated. When the random effect is on an observation level, it is consistently underestimated. This happens when using both the coxme<https://cran.r-project.org/web/packages/coxme/coxme.pdf> package, and the and frailty<https://www.rdocumentation.org/packages/survival/versions/2.11-4/topics/frailty> statement in coxph.

Questions:

1) Why is variance of the random effect underestimating when the random effect is on an individual level, is this a mathematical problem or a coding issue?

2) Is there any literature/R packages that look specifically at random effects applied on the observation level?

Many thanks for any solutions/or references which may be helpful.

Reproducible example here (using coxme):
setwd("/mnt/ja01-home01/mbrxsap3/phd_risk/R/p4_run_analysis/")

  library(coxme)
  library(survival)

  ### Create data with a group level random effect
  simulWeib.group <- function(N, lambda, rho, beta1, beta2, beta3, beta4, rateC, sigma, M)
  {
    # covariate --> N Bernoulli trials
    x1 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x2 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x3 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x4 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))

    # Now create random effect stuff
    # Create vector of groups
    re.group <- sample(x=1:M, size=N, replace=TRUE, prob=rep(1/M, M))

    # Create vector of r.e effects
    re.effect <- rnorm(M,0,sigma)

    # Now create a vector which assigns the re.effect depending on the group
    re.group.effect <- re.effect[re.group]

    # Weibull latent event times
    v <- runif(n=N)
    Tlat <- round((- log(v) / (lambda * exp(x1 * beta1 + x2 * beta2 + x3 * beta3 + x4 * beta4 + re.group.effect)))^(1 / rho))

    # censoring times
    #C <-rep(100000,N)
    C <- rexp(n=N, rate=rateC)

    # follow-up times and event indicators
    time <- pmin(Tlat, C)
    #status <- as.numeric(rep(1,N))
    status <- as.numeric(Tlat <= C)

    # data set
    data.frame(id=1:N,
               time=time,
               status=status,
              x1 = x1,
               x2 = x2,
               x3 = x3,
               x4 = x4,
               group=re.group)
  }

  ### Create data with an individual level random effect
  simulWeib <- function(N, lambda, rho, beta1, beta2, beta3, beta4, rateC, sigma)
  {
    # covariate --> N Bernoulli trials
    x1 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x2 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x3 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))
    x4 <- sample(x=c(0, 1), size=N, replace=TRUE, prob=c(0.5, 0.5))

    # Now create random effect stuff
    # Create one vector of length N, all drawn from same normal distribution
    rand.effect <- rnorm(N,0,sigma)

    # Weibull latent event times
    v <- runif(n=N)
    Tlat <- round((- log(v) / (lambda * exp(x1 * beta1 + x2 * beta2 + x3 * beta3 + x4 * beta4 + rand.effect)))^(1 / rho))

    # censoring times
    #C <-rep(100000,N)
    C <- rexp(n=N, rate=rateC)

    # follow-up times and event indicators
    time <- pmin(Tlat, C)
    #status <- as.numeric(rep(1,N))
    status <- as.numeric(Tlat <= C)

    # data set
    data.frame(id=1:N,
               time=time,
               status=status,
               x1 = x1,
               x2 = x2,
               x3 = x3,
               x4 = x4)
  }

  set.seed(101)

  ### Individual random effects (frailty).
  sd.2<-rep(0,50)
  for (i in 1:50){
    data2<-simulWeib(25000,lambda=0.0001,rho=2,beta1=0.33,beta2=5,beta3=0.25,beta4=0,rateC=0.0000000001, sigma = 0.25)
    data2$id<-as.factor(data2$id)
    fit.cox2<-coxme(Surv(time,status) ~ x1 + x2 + x3 + (1 | id), data=data2)
    sd.2[i]<-sqrt(as.numeric(fit.cox2$vcoef))
    print(i)
  }

  print("model 2 done")

  ### Same as previous example, but patients are grouped

  sd.10<-rep(0,50)
  for (i in 1:50){
    data10<-simulWeib.group(25000,lambda=0.0001,rho=2,beta1=0.33,beta2=5,beta3=0.25,beta4=0,rateC=0.0000000001, sigma = 0.25, M=40)
    data10$group<-as.factor(data10$group)
    fit.cox10<-coxme(Surv(time,status) ~ x1 + x2 + x3 + (1 | group), data=data10)
    sd.10[i]<-sqrt(as.numeric(fit.cox10$vcoef))
    print(i)
  }

  print("model 10 done")

PhD student
Health e-Research Centre � Farr Institute
Rm 2.006 � Vaughan House � Portsmouth Street � University of Manchester � M13 9GB

	[[alternative HTML version deleted]]