[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

Göran Broström goran.brostrom at gmail.com
Wed Oct 8 20:07:26 CEST 2008


The package 'eha' fits these distributions (and more) with general
left truncation and right censoring, and also regression models a la
survreg. Look at 'phreg' for parametric proportional hazards models
and 'aftreg' for accelerated failure time models. In your case of no
covariates, the two functions give (of course) identical results.

Hth,
Göran

On Wed, Oct 8, 2008 at 1:09 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Wed, 8 Oct 2008, Gough Lauren wrote:
>
>> Hi,
>>
>> Thank you very much for your reply. This seems to be working OK when
>> fitting weibull and lognormal distributions.  However, fitdistr now
>> requires me to include start values:
>
> As documented.
>
>>> ltwei<-function(x,shape,scale,log=FALSE){
>>
>> + dweibull(x,shape,scale,log)/pweibull(1,shape,scale,lower=FALSE)
>> + }
>>>
>>> ltweifit<-fitdistr(x,ltwei) # x is observed data
>>
>> Error in fitdistr(x, ltwei) : 'start' must be a named list
>>>
>>> ltweifit<-fitdistr(x,ltwei,start=list(shape=0.5,scale=0.5))
>>
>> There were 34 warnings (use warnings() to see them)
>>>
>>> ltweifit
>>
>>     shape         scale
>>  1.11108278   13.00703630
>> ( 0.01936651) ( 0.42897340)
>>
>> Is there anyway I can fit to truncated data without having to name start
>> values?  Alternatively, is there any recommended technique for choosing
>> sensible start values?
>
> Not really, depends how heavy the truncation is.
>
>> Further, when I try to fit an exponential distribution I get an error
>> message:
>
> But a truncated exponential is just a shifted exponential and has one
> parameter -- you gave it two!  Just fit an exponential to x-1.
>
>>> ltexp<-function(x,rate,log=FALSE){
>>
>> + dexp(x,rate,log)/pexp(1,rate,lower=FALSE)
>> + }
>>>
>>> ltexpfit<-fitdistr(x,ltexp)
>>
>> Error in fitdistr(x, ltexp) : 'start' must be a named list
>>>
>>> ltexpfit<-fitdistr(x,ltexp,start=list(0.1))
>>
>> Warning message:
>> In optim(x = c(2.541609, 1.436143, 4.600524, 6.437174, 2.84974,  :
>>  one-diml optimization by Nelder-Mead is unreliable: use optimize
>>>
>>> ltexpfit
>>
>> Error in dn[[2]] : subscript out of bounds
>>
>> This error message seems to occur regardless of the start value used.
>> Do you know why this is?
>>
>> Sorry to pester you again, and apologies if I am asking silly questions
>> - my knowledge of R and probability distributions (except the normal!)
>> are rather limited!
>>
>> Best wishes
>>
>> Lauren
>>
>> -----Original Message-----
>> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
>> Sent: 07 October 2008 12:25
>> To: Richard.Cotton at hsl.gov.uk
>> Cc: Gough Lauren; vito muggeo; r-help at r-project.org
>> Subject: Re: [R] Fitting weibull, exponential and lognormal
>> distributions to left-truncated data.
>>
>> On Tue, 7 Oct 2008, Richard.Cotton at hsl.gov.uk wrote:
>>
>>>>> I have several datasets, all left-truncated at x=1, that I am
>>>
>>> attempting
>>>>>
>>>>> to fit distributions to (lognormal, weibull and exponential).  I had
>>
>>>>> been using fitdistr in the MASS package as follows:
>>>
>>>> A possible solution is to use the survreg() in the survival package
>>>> without specifying the covariates, i.e.
>>>>
>>>> library(survival)
>>>> survreg(Surv(..)~1, dist="weibull")
>>>>
>>>> where Surv(..) accepts information about "times",
>>>> censoring/truncation variables and dist allows to specify alternative
>>
>> distributions.
>>>>
>>>> See ?Surv e ?survreg
>>>
>>> The survival package is mostly targeted at right-censored data.  The
>>> NADA package provides wrappers for many of the survival routines so
>>> they work with left-censored data.
>>
>> Left-censoring and left-truncation are not the same thing.  With
>> left-censoring you see that you had observations < 1, and with
>> left-truncation you do not (at least how the terms are usually applied:
>> occasionally the meanings are reversed).
>>
>> For left-truncation it is relatively easy, e.g.
>>
>> ltwei <- function(x, shape, scale = 1, log = FALSE)
>>    dweibull(x, shape, scale, log)/pweibull(1, shape, scale,
>> lower=FALSE)
>>
>> and use this in fitdistr.
>>
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>
>> This message has been checked for viruses but the contents of an
>> attachment
>> may still contain software viruses, which could damage your computer
>> system:
>> you are advised to perform your own checks. Email communications with the
>> University of Nottingham may be monitored as permitted by UK legislation.
>>
>>
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Göran Broström


More information about the R-help mailing list