[R] 3-parametric Weibull regression

Pfrengle Andreas (GS-SI/ENX7) Andreas.Pfrengle at de.bosch.com
Tue Jan 31 16:40:49 CET 2012


Hello Terry,

thank you for your help. I've now tried your suggestion and worked out the following code. What I forgot to mention in my last mail is that I have interval-censored data (I'm testing with discrete, increasing fluid volumes). I would be grateful if you could review my solution:

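library(survival)

# Weibull log-likelihood for a given offset (failure-free volume)
myfun <- function(offset, vol_min, vol, status, x) {
  vol_min2 <- vol_min - offset
  vol2 <- vol - offset
  # fit on the shifted volumes so that the offset actually enters the likelihood
  fit <- survreg(Surv(vol_min2, vol2, status, type="interval") ~ x, dist="weibull", subset=(vol_min2 > 0))
  fit$loglik[2]
}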

I've edited "myfun" to return fit$loglik[2], the model's log-likelihood; otherwise I get the error "invalid function value in 'optimize'".
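A likely reason for that error is that optimize() expects its objective to return a single number, while fit$loglik is a length-2 vector (intercept-only model and full model). The following base-R sketch reproduces the behaviour with a toy objective; f_scalar, f_vector and shift are hypothetical names used only for illustration.

```r
# Toy objective with a known maximum at x = shift (purely illustrative)
f_scalar <- function(x, shift) -(x - shift)^2
# Same objective, but returning two values, mimicking a length-2 loglik
f_vector <- function(x, shift) c(0, -(x - shift)^2)

# Extra arguments (here: shift) are passed through to the objective
ok <- optimize(f_scalar, c(0, 5), shift = 2, maximum = TRUE)
ok$maximum

# A non-scalar return value makes optimize() stop with an error
err <- tryCatch(optimize(f_vector, c(0, 5), shift = 2, maximum = TRUE),
                error = function(e) conditionMessage(e))
err
```

Selecting one element, as with fit$loglik[2], gives optimize() the scalar it needs.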
To get the optimized model over categorical factor "Gruppe" and continuous factor "d", I did:

fit0 <- survreg(Surv(Vol_min, Vol, status, "interval") ~ Gruppe + d + Gruppe * d, data=data, x=T, y=T)
ofit <- optimize(myfun, c(0, .99*min(data$Vol_min[data$status!=0])), vol_min=exp(fit0$y[,1]), vol=exp(fit0$y[,2]), status=fit0$y[,3], x=fit0$x, maximum=T)
fit1 <- survreg(Surv(Vol_min - ofit$maximum, Vol - ofit$maximum, status, "interval") ~ Gruppe + d + Gruppe * d, data=data, x=T, y=T)
summary(fit1)

I needed to exp() the fit0$y values since they were log-transformed, so that "myfun" receives the original values again.
fit1 then fits the model with the optimized offset value.
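The whole procedure can be tried on simulated data where the true offset is known. The sketch below is only an illustration under stated assumptions: the data are simulated, every observation is interval-censored (status code 3 in Surv(type = "interval")), and the names shift, step and profile_loglik are hypothetical. Because the search interval ends below the smallest lower bound, no observation is dropped and the log-likelihoods for different offsets refer to the same data.

```r
library(survival)

set.seed(1)
n <- 200
true_offset <- 1                      # assumed failure-free volume
d <- runif(n, 100, 500)               # hypothetical continuous covariate
v <- true_offset + rweibull(n, shape = 2, scale = exp(0.5 + 0.002 * d))

# Interval-censor on a grid of test volumes: failure lies in (vol_min, vol]
step    <- 0.25
vol     <- ceiling(v / step) * step
vol_min <- vol - step
status  <- rep(3, n)                  # 3 = interval-censored

# Log-likelihood of the Weibull fit after shifting both interval bounds
profile_loglik <- function(shift) {
  fit <- survreg(Surv(vol_min - shift, vol - shift, status, type = "interval") ~ d,
                 dist = "weibull")
  fit$loglik[2]
}

# Search below the smallest lower bound so every shifted bound stays positive
upper <- 0.99 * min(vol_min)
ofit  <- optimize(profile_loglik, c(0, upper), maximum = TRUE)
ofit$maximum                          # estimated offset

# Final fit with the estimated offset
shift_hat <- ofit$maximum
fit1 <- survreg(Surv(vol_min - shift_hat, vol - shift_hat, status, type = "interval") ~ d,
                dist = "weibull")
```

With coarse intervals the likelihood can be rather flat in the offset, so the estimate need not land close to the simulated value.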
My next question is: how do I use the output of fit1 to build the formula for the distribution (i.e. the "a" and "b" parameters described in the dweibull help, as functions of my factors)? Is the following correct (example plot for group 0, d=300)?
Also, is my assumption correct that the Weibull exponent (obtained from $scale) is not modelled as depending on the factors, but is constant for the whole dataset?

a <- 1 / fit1$scale
b <- function(Gruppe, d) exp(fit1$coeff[1] + fit1$coeff[2] * Gruppe + fit1$coeff[3] * d + fit1$coeff[4] * Gruppe * d)
x <- seq(0,5,.02)
plot(x, dweibull(x - ofit$maximum, a, b(0, 300)))
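For reference, survreg fits the Weibull model in log-linear form, log(T) = lp + scale * W, where W follows the standard minimum extreme value distribution. This corresponds to the dweibull parameters shape = 1/scale and scale = exp(lp). The sketch below checks this correspondence on quantiles with base R only; lp and sigma are hypothetical numbers standing in for a fitted linear predictor and fit1$scale.

```r
# Hypothetical values standing in for a fitted survreg model
lp    <- 1.2    # linear predictor (intercept + covariate terms), assumed
sigma <- 0.5    # survreg "scale", assumed
p     <- c(0.1, 0.5, 0.9)

# Quantiles from the log-linear form log(T) = lp + sigma * W
w_p         <- log(-log(1 - p))   # quantiles of the standard minimum extreme value distribution
t_loglinear <- exp(lp + sigma * w_p)

# The same quantiles from the dweibull/qweibull parameterization
t_weibull <- qweibull(p, shape = 1 / sigma, scale = exp(lp))

all.equal(t_loglinear, t_weibull)
```

In a plain survreg call there is a single scale parameter, so the shape 1/scale is common to all observations. As far as I know, the survival package lets the scale differ between groups through a strata() term in the formula.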



Best regards

Dr. Andreas Pfrengle

Robert Bosch GmbH
 (GS-SI/ENX7-Fe)
Postfach 30 02 20
70442 Stuttgart
GERMANY
www.bosch.com

Tel. +49 711 811-3622390
andreas.pfrengle at de.bosch.com



-----Original Message-----
From: Terry Therneau [mailto:therneau at mayo.edu]
Sent: Friday, 27 January 2012 15:18
To: Pfrengle Andreas (GS-SI/ENX7)
Cc: r-help at r-project.org
Subject: Re: 3-parametric Weibull regression

--- begin included message ---

Hello,

I'm quite new to R and want to make a Weibull-regression with the
survival package. I know how to build my "Surv"-object and how to make a
standard-weibull regression with "survreg".
However, I want to fit a translated or 3-parametric weibull dist to
account for a failure-free time.
I think I would need a new object in survreg.distributions, but I don't
know how to create that correctly. I've tried to inherit it from the
"extreme" value distribution like so
......

---  end inclusion ------

 I don't think that this is possible through the survreg.distributions
approach.  One problem is early censoring: say an observation was
censored at time 10, and your delay time estimate were 15: the
underlying routine would need to drop that obs from the calculations,
and there is no mechanism to do that.
  An alternative approach would be to include survreg in an optimize
call.  Here is an example

 myfun <- function(lower, time, status, x) {
        time2 <- time-lower
        fit <- survreg(Surv(time2, status) ~x, dist="weibull",
                          subset= (time2 > 0))
        fit$loglik
        }
 fit0 <- survreg(Surv(time, status) ~ x1 + x2 + ...., data=yd,
        x=T, y=T)
 ofit <- optimize(myfun, c(0, .99*min(yd$time[yd$status==1])),
        time=fit0$y[,1], status=fit0$y[,2], x=fit0$x, maximum=T)

This will give you the optimal value for the threshold.  The .99 is to
stop the routine from an intermediate solution where there is a death at
exactly time 0.  The data set is yd="your data".

Terry Therneau


