[R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph

vikkiyft s067835 at alumni.cuhk.net
Thu Feb 24 18:29:47 CET 2011


Dear Prof Frank,

I tried to simulate an example data set as close as possible to my own real
data with the codes below. There are only two covariates, tumor(3 levels)
and ecog(3 levels). "rx" is treatment (4 levels). Validation with the
stratified model (by rx) had a negative R2.. and the R2 under the training
column was so high...? 

n<-1000
set.seed(110222)
data<-matrix(rep(0,5000),ncol=5)
data[,1]<-sample(1:3,n,rep=TRUE,prob=c(.32, .30, .38))
for (i in 1:1000){
if (data[i,1]==1) data[i,2]<-sample(1:3,1,prob=c(.76, .18, .06))
if (data[i,1]==2) data[i,2]<-sample(1:3,1,prob=c(.67, .24, .09))
if (data[i,1]==3) data[i,2]<-sample(1:3,1,prob=c(.47, .37, .16))}
for (i in 1:1000){
if (data[i,1]==1) data[i,3]<-sample(1:4,1,prob=c(.70, .19, .03, .08))
if (data[i,1]==2) data[i,3]<-sample(1:4,1,prob=c(.42, .28, .12, .18))
if (data[i,1]==3) data[i,3]<-sample(1:4,1,prob=c(.11, .29, .30, .30))}
for (i in 1:1000){
if (data[i,3]==1)
data[i,4]<-12*rgamma(1000,rate=0.4,shape=1.7)[c(sample(26:975,1,prob=c(rep(1/950,950))))]
if (data[i,3]==2)
data[i,4]<-12*rgamma(1000,rate=0.9,shape=1.7)[c(sample(26:975,1,prob=c(rep(1/950,950))))]
if (data[i,3]==3)
data[i,4]<-12*rgamma(1000,rate=1.2,shape=0.6)[c(sample(26:975,1,prob=c(rep(1/950,950))))]
if (data[i,3]==4)
data[i,4]<-12*rgamma(1000,rate=1.5,shape=0.7)[c(sample(26:975,1,prob=c(rep(1/950,950))))]}
for (i in 1:1000){
if (data[i,3]==1) data[i,5]<-sample(c(0,1),1,prob=c(.53, .47))
if (data[i,3]==2) data[i,5]<-sample(c(0,1),1,prob=c(.17, .83))
if (data[i,3]==3) data[i,5]<-sample(c(0,1),1,prob=c(.05, .95))
if (data[i,3]==4) data[i,5]<-sample(c(0,1),1,prob=c(.06, .94))}
colnames(data)<-c("tumor","ecog","rx","os","censor")
data<-data.frame(data)
attach(data)

library(rms)
surv.obj<-Surv(os,censor)
validate(cph(surv.obj~factor(tumor)+factor(ecog),x=T,y=T,surv=T),method="b",B=100,dxy=T)
validate(cph(surv.obj~factor(tumor)+factor(ecog)+strat(rx),
x=T,y=T,surv=T,time.inc=60),method="b",B=100,dxy=T,u=60)



Best regards,
Vikki
-- 
View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3323016.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list