[R] Fitting weibull and exponential distributions to left censoring data

Göran Broström goran.brostrom at gmail.com
Sun Nov 2 14:18:30 CET 2008


On Sun, Nov 2, 2008 at 11:22 AM, Christian Ritz <ritz at life.ku.dk> wrote:
> Hi Göran,
>
> the R package NADA is specifically designed for left-censored data:
>
> http://www.statistik.uni-dortmund.de/useR-2008/slides/Helsel+Lee.pdf
>
>
> The function cenreg() in NADA is a front end to survreg().

Nice; but how do you fit these data to a Weibull distribution? I get
it working for the default distribution (lognormal), but not for Weibull:

> cenreg(y, !d, rep(1, length(y)), dist = "lognormal")
                   Value Std. Error     z         p
(Intercept)       -0.138     0.0166  -8.3  1.10e-16
rep(1, length(y))  0.000     0.0000   NaN       NaN
Log(scale)        -0.849     0.0354 -24.0 1.82e-127

Scale= 0.428

Log Normal distribution
Loglik(model)= -738.4   Loglik(intercept only)= -738.4
Loglik-r:  2.107342e-08

Chisq= 0 on 1 degrees of freedom, p= 1
Number of Newton-Raphson Iterations: 1
n = 1000

That's fine , but

> cenreg(y, !d, rep(1, length(y)), dist = "weibull")
Error in checkSlotAssignment(object, name, value) :
  "y" is not a slot in class "survreg"

Göran

> Christian
>
>
>
> Göran Broström wrote:
>> On Fri, Oct 31, 2008 at 2:27 PM, Terry Therneau <therneau at mayo.edu> wrote:
>>>  Use the survreg function.
>>
>> The survreg function cannot fit left censored data (correct me if I am
>> wrong!), neither can phreg or aftreg (package eha). On the other hand,
>> if Borja instead wanted to fit left truncated data (it is a common
>> mistake to confuse left truncation with left censoring), it is
>> possible to use phreg or aftreg, but still not survreg (again, correct
>> me if I am wrong).
>>
>> If instead Borja really wants to fit left censored data, it is fairly
>> simple with the aid of the function optim, for instance by calling
>> this function:
>>
>> left <- function(x, d){
>>     ## d[i] = FALSE: x[i] is left censored
>>     ## d[i] = TRUE: x[i] is observed exactly
>>     loglik <- function(param){# The loglihood function
>>         lambda <- exp(param[2])
>>         p <- exp(param[1])
>>         sum(ifelse(d,
>>                         dweibull(x, p, lambda, log = TRUE),
>>                         pweibull(x, p, lambda, log.p = TRUE)
>>                    )
>>             )
>>     }
>>     par <- c(0, 0)
>>     res <- optim(par, loglik, control = list(fnscale = -1), hessian = TRUE)
>>     list(log.shape = res$par[1],
>>          log.scale = res$par[2],
>>          shape = exp(res$par[1]),
>>          scale = exp(res$par[2]),
>>          var.log = solve(-res$hessian)
>>          )
>> }
>>
>> Use like this:
>>
>>> x <- rweibull(500, shape = 2, scale = 1)
>>> d <- x > median(x) # 50% left censoring, Type II.
>>> y <- ifelse(d, x, median(x))
>>> left(y, d)
>>
>> $log.shape
>> [1] 0.707023
>>
>> $log.scale
>> [1] -0.007239744
>>
>> $shape
>> [1] 2.027945
>>
>> $scale
>> [1] 0.9927864
>>
>> $var.log
>>              [,1]         [,2]
>> [1,] 0.0022849526 0.0005949114
>> [2,] 0.0005949114 0.0006508635
>>
>>
>>>  There are many different ways to parameterize a Weibull.  The survreg function
>>> imbeds it a general location-scale familiy, which is a different
>>> parameterization than the rweibull function.
>>>
>>>> y <- rweibull(1000, shape=2, scale=5)
>>>> survreg(Surv(y)~1, dist="weibull")
>>> Coefficients:
>>> (Intercept)
>>>   1.592543
>>>
>>> Scale= 0.5096278
>>>
>>> Loglik(model)= -2201.9   Loglik(intercept only)= -2201.9
>>>
>>> ----
>>>
>>>  survreg's scale  =    1/(rweibull shape)
>>>  survreg's intercept = log(rweibull scale)
>>>  For the log-likelihood all parameterizations lead to the same value.
>>>
>>>  There is not "right" or "wrong" parameterization for a Weibull (IMHO),
>>
>> Correct, but there are two points I would like to add to that:
>> (i) It is a good idea to perform optimisation with a parametrization
>> that give no range restrictions.
>>
>> (ii) It is a good idea to transform back the results to the
>> parametrization that is standard in R, for comparative reasons.
>>
>> See for example the function 'left' above.
>>
>>> but
>>> there certainly is a lot of room for confusion.  This comes up enough that I
>>> have just added it as an example in the survreg help page, which will migrate to
>>> the general R distribution in due course.
>>>
>>>  Terry Therneau
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
> Original posting: http://tolstoy.newcastle.edu.au/R/e5/help/08/10/5673.html
>



-- 
Göran Broström


More information about the R-help mailing list