[R] Fitting weibull, exponential and lognormal distributions to left-truncated data.

Tue Oct 7 12:11:54 CEST 2008

Dear All,

I have two questions regarding distribution fitting.

I have several datasets, all left-truncated at x=1, that I am attempting
to fit distributions to (lognormal, weibull and exponential).  I had
been using fitdistr in the MASS package as follows:

fitdistr<-(x,"weibull")

However, this does not take into consideration the truncation at x=1.  I
read another posting in this forum that suggested using the argument
"lower" to truncate the distribution fitting.  However, this does not
seem to be working.  For example, when I attempt to fit a weibull
distribution truncated at x=1 using "lower", it seems to set the
best-fit shape parameter at 1:

> fitdistr(x,"weibull",lower=1)
     shape        scale   
  1.00000000   9.87964337 
 (0.02358731) (0.40649570) ##I have tried this on other datasets also
truncated at x=1 and get the same result (i.e. shape=1).

Does anyone know how to successfully fit the exponential, weibull and
lognormal distributions to truncated data?

Secondly, as my datasets are large (>1000 data points) assessing the fit
of the distribution with kolmogorov smirnov goodness of fit tests is
routinely showing statistical significance for all distributions.
Therefore, I would like to plot the observed data with the theoretical
best fit distributions (weibull, exponential and lognormal) to visually
assess which fits the observed data best.  So far I have been doing this
as follows:

>fitdistr(x,"weibull")
shape        scale   
  a   		b 

>D1<-density(x) ##density distribution of observed data
>D2<-density(rweibull(1500,shape=a,scale=b)) ##density of a random
variable following the theoretical best fit weibull distribution with
shape parameter =a, scale parameter = b.

>plot(range(D1$x),range(D1$y,D2$y),type="n",xlab="x",ylab="Density")
>lines(D1,col="red")
>lines(D2,col="blue")

This successfully plots the two density curves on the same graph, but it
plots data below the x=1 threshold - even for the observed data!  I have
tried limiting the scale of x-axis using xlim=c(1,150) but the graph
still plots the origin of the graph as (0,0).  I can only get different
origins if I limit x more extremely e.g. xlim=c(50,150).  Does anyone
know how I can successfully change the origin of the graph to (1,0)?

Sorry for the long e-mail! Any help would be greatly appreciated.

Regards,

Lauren

This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.