[R] Fitting weibull and exponential distributions to leftcensoring data
R Heberto Ghezzo, Dr
heberto.ghezzo at mcgill.ca
Sun Nov 2 15:32:14 CET 2008
Hi,
I tried the suggestion of Terry, as he stated it, it works perfectly but if I have
>>> x <- rweibull(500, shape = 2, scale = 1)
>>> d <- rep(c(TRUE,FALSE),each=250) # 50% left censoring, at different times
>>> left(x, d)
the scale comes as 0.7 and shape 1.5
Any one knows a solution?
Thanks
Heberto.Ghezzo
McGill University
-----Original Message-----
From: r-help-bounces at r-project.org on behalf of Göran Broström
Sent: Sun 11/2/2008 8:18 AM
To: Christian Ritz
Cc: r-help at r-project.org; dhelsel at usgs.gov
Subject: Re: [R] Fitting weibull and exponential distributions to leftcensoring data
On Sun, Nov 2, 2008 at 11:22 AM, Christian Ritz <ritz at life.ku.dk> wrote:
> Hi Göran,
>
> the R package NADA is specifically designed for left-censored data:
>
> http://www.statistik.uni-dortmund.de/useR-2008/slides/Helsel+Lee.pdf
>
>
> The function cenreg() in NADA is a front end to survreg().
Nice; but how do you fit these data to a Weibull distribution? I get
it working for the default distribution (lognormal), but not for Weibull:
> cenreg(y, !d, rep(1, length(y)), dist = "lognormal")
Value Std. Error z p
(Intercept) -0.138 0.0166 -8.3 1.10e-16
rep(1, length(y)) 0.000 0.0000 NaN NaN
Log(scale) -0.849 0.0354 -24.0 1.82e-127
Scale= 0.428
Log Normal distribution
Loglik(model)= -738.4 Loglik(intercept only)= -738.4
Loglik-r: 2.107342e-08
Chisq= 0 on 1 degrees of freedom, p= 1
Number of Newton-Raphson Iterations: 1
n = 1000
That's fine , but
> cenreg(y, !d, rep(1, length(y)), dist = "weibull")
Error in checkSlotAssignment(object, name, value) :
"y" is not a slot in class "survreg"
Göran
> Christian
>
>
>
> Göran Broström wrote:
>> On Fri, Oct 31, 2008 at 2:27 PM, Terry Therneau <therneau at mayo.edu> wrote:
>>> Use the survreg function.
>>
>> The survreg function cannot fit left censored data (correct me if I am
>> wrong!), neither can phreg or aftreg (package eha). On the other hand,
>> if Borja instead wanted to fit left truncated data (it is a common
>> mistake to confuse left truncation with left censoring), it is
>> possible to use phreg or aftreg, but still not survreg (again, correct
>> me if I am wrong).
>>
>> If instead Borja really wants to fit left censored data, it is fairly
>> simple with the aid of the function optim, for instance by calling
>> this function:
>>
>> left <- function(x, d){
>> ## d[i] = FALSE: x[i] is left censored
>> ## d[i] = TRUE: x[i] is observed exactly
>> loglik <- function(param){# The loglihood function
>> lambda <- exp(param[2])
>> p <- exp(param[1])
>> sum(ifelse(d,
>> dweibull(x, p, lambda, log = TRUE),
>> pweibull(x, p, lambda, log.p = TRUE)
>> )
>> )
>> }
>> par <- c(0, 0)
>> res <- optim(par, loglik, control = list(fnscale = -1), hessian = TRUE)
>> list(log.shape = res$par[1],
>> log.scale = res$par[2],
>> shape = exp(res$par[1]),
>> scale = exp(res$par[2]),
>> var.log = solve(-res$hessian)
>> )
>> }
>>
>> Use like this:
>>
>>> x <- rweibull(500, shape = 2, scale = 1)
>>> d <- x > median(x) # 50% left censoring, Type II.
>>> y <- ifelse(d, x, median(x))
>>> left(y, d)
>>
>> $log.shape
>> [1] 0.707023
>>
>> $log.scale
>> [1] -0.007239744
>>
>> $shape
>> [1] 2.027945
>>
>> $scale
>> [1] 0.9927864
>>
>> $var.log
>> [,1] [,2]
>> [1,] 0.0022849526 0.0005949114
>> [2,] 0.0005949114 0.0006508635
>>
>>
>>> There are many different ways to parameterize a Weibull. The survreg function
>>> imbeds it a general location-scale familiy, which is a different
>>> parameterization than the rweibull function.
>>>
>>>> y <- rweibull(1000, shape=2, scale=5)
>>>> survreg(Surv(y)~1, dist="weibull")
>>> Coefficients:
>>> (Intercept)
>>> 1.592543
>>>
>>> Scale= 0.5096278
>>>
>>> Loglik(model)= -2201.9 Loglik(intercept only)= -2201.9
>>>
>>> ----
>>>
>>> survreg's scale = 1/(rweibull shape)
>>> survreg's intercept = log(rweibull scale)
>>> For the log-likelihood all parameterizations lead to the same value.
>>>
>>> There is not "right" or "wrong" parameterization for a Weibull (IMHO),
>>
>> Correct, but there are two points I would like to add to that:
>> (i) It is a good idea to perform optimisation with a parametrization
>> that give no range restrictions.
>>
>> (ii) It is a good idea to transform back the results to the
>> parametrization that is standard in R, for comparative reasons.
>>
>> See for example the function 'left' above.
>>
>>> but
>>> there certainly is a lot of room for confusion. This comes up enough that I
>>> have just added it as an example in the survreg help page, which will migrate to
>>> the general R distribution in due course.
>>>
>>> Terry Therneau
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>
> Original posting: http://tolstoy.newcastle.edu.au/R/e5/help/08/10/5673.html
>
--
Göran Broström
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list