[R] Choosing between functional forms using flexible parametric survival models

Bonnett, Laura L@J@Bonnett @ending from liverpool@@c@uk
Thu Sep 27 17:26:31 CEST 2018

Dear all,

I am using R 3.4.3 on Windows 10.  I am writing code to use in a forthcoming teaching session.  As part of the workshop the students are using breast cancer data made available by Patrick Royston and available from http://www.statapress.com/data/fpsaus.html (I didn't pick the dataset by the way).  I would like the students to visualise linear, fractional polynomial and spline transformations of the "node" variable using a flexible parametric model with 3 knots for the baseline hazard.  I can do this using the "predict" option within stpm2 as follows:

flex_nodes_lin <- stpm2(Surv(rfs/12,rfsi)~nodes, data=Practical_Rott_dev,df=3)
haz_lin <- predict(flex_nodes_lin,type="hazard")

flex_nodes_fp <- stpm2(Surv(rfs/12,rfsi)~log(nodes),data=Practical_Rott_dev,df=3)
haz_fp <- predict(flex_nodes_fp,type="hazard")

spline3 <- stpm2(Surv(rfs/12,rfsi)~1, data=Practical_Rott_dev,df=3)
haz_spline3 <- predict(spline3,type="hazard")

data_part9 <- data.frame(nodes,haz_lin[nodes],haz_spline3[nodes],haz_fp[nodes])
data_part9_m <- melt(data_part9,id.vars='nodes',factorsAsStrings=F)
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","FP1","Spline 3 knots"),values=c("green","red","blue"))+theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="",color="") + theme(legend.position=c(0.8,0.8))

However, to my mind using "hazard" (or "survival") leads to a plot which do not help to understand the different functional form of "nodes".  Therefore, I would prefer to do this using the linear predictor for each model instead.  I've written the following code to do this:
lp_nodes_lin <- flex_nodes_lin using lm$fitted.values
lp_nodes_spline <- flex_nodes_spline using lm$fitted.values
lp_nodes_fp <- flex_nodes_fp using lm$fitted.values

data_part9 <- data.frame(flex_nodes_lin using lm$model$nodes,lp_nodes_lin,lp_nodes_spline,lp_nodes_fp)
colnames(data_part9)[1] <- "nodes"

data_part9_m <- melt(data_part9,id.vars='nodes')
plot_part9 <- ggplot(data_part9_m,aes(nodes,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","Spline (3 knots)", "FP1"),values=c("green","red","blue"))+theme_bw()
plot_part9 + labs(x="Number of positive nodes",y="Prediction",color="") + theme(legend.position=c(0.8,0.8))

I have 2 concerns over this:

1.       The plots are still not the shape I would expect them to be i.e. a line along the 45 degree line for the linear transformation, and a curve for each of the spline and FP transformations.

2.       This code is really complicated - there must be an easier way?!

Any help gratefully received!

Kind regards,

P.S. If I was doing this in the logistic regression the code would be relatively simple:
age_mod <- glm(DAY30~AGE,family="binomial")
lp_age_lin <- predict(age_mod)

agefp1_mod <- mfp(DAY30~fp(AGE,df=2,alpha=1),family="binomial")
lp_agefp1 <- predict(agefp1_mod)

age3_mod <- glm(DAY30~age3_spline,family="binomial")
lp_age3 <- predict(age3_mod)

data_part8 <- data.frame(AGE,lp_age_lin,lp_agefp1,lp_age3)
data_part8_m <- melt(data_part8,id.vars='AGE')
plot_part8 <- ggplot(data_part8_m,aes(AGE,value,colour=variable))+geom_line()+scale_colour_manual(labels=c("Linear","FP1","Spline 3 knots"),values=c("green","blue","red"))+theme_bw()
plot_part8 + labs(x="Age (years)",y="Linear Predictor (log odds)",color="") + theme(legend.position=c(0.2,0.8))

	[[alternative HTML version deleted]]

More information about the R-help mailing list