[R] How to specify "newdata" in a Cox-Modell with a time dependent interaction term?

John Fox jfox at mcmaster.ca
Sat Jun 16 15:55:29 CEST 2012


Dear Jürgen,

All the values of your Id variable are equal to 1; how can that make sense? Removing the argument Id=id may get you what you want.

By the way, the document to which you refer was an appendix to a 2002 book that has been superseded by Fox and Weisberg, An R Companion to Applied Regression, Second Edition (Sage, 2011). Appendices for that book are available at <http://socserv.socsci.mcmaster.ca/jfox/Books/Companion/appendix.html>.

I hope this helps,
 John

------------------------------------------------
John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
	
On Sat, 16 Jun 2012 15:04:48 +0200
 Jürgen Biedermann <juergen.biedermann at googlemail.com> wrote:
> Dear Mr. Therneau, Mr. Fox, or to whoever, who has some time...
> 
> I don't find a solution to use the "survfit" function (package: survival)  for a defined pattern of covariates with a Cox-Model including a time dependent interaction term. Somehow the definition of my "newdata" argument seems to be erroneous.
> I already googled the problem, found many persons having the same or a similar problem, but still no solution.
> I want to stress that my time-dependent covariate does not depend on the failure of an individual (in this case it doesn't seem sensible to predict a survivor function for an individual). Rather one of my effects declines with time (time-dependent coefficient).
> 
> For illustration, I use the example of John Fox's paper "Cox Proportional - Hazards Regression for Survival Data".
> http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
> 
> Do you know any help? See code below
> 
> Thanks very much in advance
> Jürgen Biedermann
> 
> #----------------------------------------
> #Code
> 
> Rossi <- read.table("http://cran.r-project.org/doc/contrib/Fox-Companion/Rossi.txt", header=T)
> 
> Rossi.2 <- fold(Rossi, time='week',
>      event='arrest', cov=11:62, cov.names='employed')
> 
> # see below for the fold function from John Fox
> 
> # modeling an interaction with time (Page 14)
> 
> mod.allison.5 <- coxph(Surv(start, stop, arrest.time) ~
>      fin + age + age:stop + prio,
>      data=Rossi.2)
> mod.allison.5
> 
> # Attempt to get the survivor function of a person with age=30, fin=0 and prio=5
> 
> newdata.1 <- data.frame(unique(Rossi.2[c("start","stop")]),fin=0,age=30,prio=5,Id=1,arrest.time=0)
> fit <- survfit(mod.allison.5,newdata.1,id="Id")
> 
> Error message:
> 
>  >Fehler in model.frame.default(data = newdata.1, id = "Id", formula = 
> Surv(start,  :
>    Variablenlängen sind unterschiedlich (gefunden für '(id)')
> 
> --> failure, length of variables are different.
> 
> #-----------------------------------------------------------------
> fold <- function(data, time, event, cov,
>      cov.names=paste('covariate', '.', 1:ncovs, sep=""),
>      suffix='.time', cov.times=0:ncov, common.times=TRUE, lag=0){
>      vlag <- function(x, lag) c(rep(NA, lag), x[1:(length(x)-lag)])
>      xlag <- function(x, lag) apply(as.matrix(x), 2, vlag, lag=lag)
>      all.cov <- unlist(cov)
>      if (!is.list(cov)) cov <- list(cov)
>      ncovs <- length(cov)
>      nrow <- nrow(data)
>      ncol <- ncol(data)
>      ncov <- length(cov[[1]])
>      nobs <- nrow*ncov
>      if (length(unique(c(sapply(cov, length), length(cov.times)-1))) > 1)
>          stop(paste(
>              "all elements of cov must be of the same length and \n",
>              "cov.times must have one more entry than each element of 
> cov."))
>      var.names <- names(data)
>      subjects <- rownames(data)
>      omit.cols <- if (!common.times) c(all.cov, cov.times) else all.cov
>      keep.cols <- (1:ncol)[-omit.cols]
>      nkeep <- length(keep.cols)
>      if (is.numeric(event)) event <- var.names[event]
>      times <- if (common.times) matrix(cov.times, nrow, ncov+1, byrow=T)
>          else data[, cov.times]
>      new.data <- matrix(Inf, nobs, 3 + ncovs + nkeep)
>      rownames <- rep("", nobs)
>      colnames(new.data) <- c('start', 'stop', paste(event, suffix, sep=""),
>          var.names[-omit.cols], cov.names)
>      end.row <- 0
>      for (i in 1:nrow){
>          start.row <- end.row + 1
>          end.row <- end.row + ncov
>          start <- times[i, 1:ncov]
>          stop <- times[i, 2:(ncov+1)]
>          event.time <- ifelse (stop == data[i, time] & data[i, event] == 
> 1, 1, 0)
>          keep <- matrix(unlist(data[i, -omit.cols]), ncov, nkeep, byrow=T)
>          select <- apply(matrix(!is.na(data[i, all.cov]), ncol=ncovs), 
> 1, all)
>          rows <- start.row:end.row
>          cov.mat <- xlag(matrix(unlist(data[i, all.cov]), 
> nrow=length(rows)), lag)
>          new.data[rows[select], ] <-
>              cbind(start, stop, event.time, keep, cov.mat)[select,]
>          rownames[rows] <- paste(subjects[i], '.', seq(along=rows), sep="")
>          }
>      row.names(new.data) <- rownames
>      as.data.frame(new.data[new.data[, 1] != Inf &
>          apply(as.matrix(!is.na(new.data[, cov.names])), 1, all), ])
>      }
> #-----------------------------------------------------------------
>



More information about the R-help mailing list