[R] (bug?) in survfit in R-2-15.0

Terry Therneau therneau at mayo.edu
Thu May 17 14:59:46 CEST 2012


The predict methods are designed to work with a dataframe.  In the case 
of survfit(cox model) the code has, for backwards compatability 
purposes, the ability to use a vector as the "newdata" argument.  This 
feature is not documented in the help file, on purpose, and is expected 
to "go away" one day.  This backwards compatatibilty works in all my 
test cases, but not in the one you present.  This may accelerate my 
plans for removal.

   The best solution for you is to use a dataframe for both the coxph 
fit and the predict call.  Avoid using objects that are in the global 
data or are attached.  Below I reprise your example in safer code.  The 
coxph/survfit calls both work from a temporary data frame called 
"dummy".  Your use of the coxph/survfit pair to get a survival curve for 
glmnet is clever, by the way.  I like it.

library(glmnet)
library(survival)
load(system.file("doc","VignetteExample.rdata",package="glmnet"))

fit <- with(patient.data, glmnet(x[-1,], Surv(time[-1], status[-1]),
                                  family="cox", alpha=.05, maxit=10000))
max.dev.index <- which.max(fit$dev.ratio)
optimal.beta <- fit$beta[, max.dev.index]
nonzero.coef <- (optimal.beta != 0)

dummy <- with(patient.data, data.frame(time=time, status=status, 
x=x[,nonzero.coef]))
coxfit <- coxph(Surv(time, status) ~ ., data=dummy, subset= -1,
             iter=0, init=optimal.beta[nonzero.coef])
sfit <- survfit(coxfit, newdata=dummy[1,])

Terry Therneau

---- begin included message -------

Dear all,

I am using glmnet + survival and due to the latest
release of glmnet 1.7.4 I was forced to use the latest version of R 2.15.0.
My previous version of R was 2.10.1. I changed glmnet version and R
version and when I started to get weird results I was not sure where the 
bug was.

After putting glmnet back to previous version, I have
found that the bug or weird behaviour happens in survival survfit. My 
script is
below in email and it is uses glmnet 1.7.3. I get two completely different
answers in different versions of R and in my opinion the older version of
survival package returns correct value.

in R 2-10-1 I get

?>
cat(dim(sfit$surv))

?

?>
cat(length(sfit$surv))

32

?>

?

in R-2-15-0 I get

cat(dim(sfit$surv))

62 99

?>
cat(length(sfit$surv))

6138

?>

?Kind regardsDK


library(survival)

library(glmnet)

load(system.file("doc","VignetteExample.rdata",package="glmnet"))

attach(patient.data)

trainX?????
<-x[-1,]

trainTime??
<-time[-1]

trainStatus <- status[-1]

fit <-

glmnet(trainX,Surv(trainTime,trainStatus),family="cox",alpha=0.5,maxit=10000)

max.dev.index????
<- which.max(fit$dev.ratio)

optimal.lambda <- fit$lambda[max.dev.index]
optimal.beta? <-
fit$beta[,max.dev.index] nonzero.coef <- abs(optimal.beta)>0 selectedBeta
<- optimal.beta[nonzero.coef]

selectedTrainX??
<- trainX[,nonzero.coef]

coxph.model<- coxph(Surv(trainTime,trainStatus)~

selectedTrainX,init=selectedBeta,iter=0)

selectedTestX <- x[1,nonzero.coef]

sfit<- survfit(coxph.model,newdata=selectedTestX)

cat("\ndim(sfit$surv)")

cat(dim(sfit$surv))

cat("\nlength(sfit$surv)")

cat(length(sfit$surv))



More information about the R-help mailing list