Gough Lauren
Tue Oct 7 12:11:54 CEST 2008
Dear All,
I have two questions regarding distribution fitting.
I have several datasets, all left-truncated at x=1, to which I am
attempting to fit distributions (lognormal, Weibull and exponential). I
had been using fitdistr in the MASS package as follows:
> fitdistr(x,"weibull")
However, this does not take into consideration the truncation at x=1. I
read another posting in this forum that suggested using the argument
"lower" to truncate the distribution fitting. However, this does not
seem to work. For example, when I attempt to fit a Weibull
distribution truncated at x=1 using "lower", the best-fit shape
parameter comes out as exactly 1:
> fitdistr(x,"weibull",lower=1)
      shape        scale
   1.00000000   9.87964337
  (0.02358731) (0.40649570)
I have tried this on other datasets also truncated at x=1 and get the
same result (i.e. shape=1).
Does anyone know how to successfully fit the exponential, Weibull and
lognormal distributions to left-truncated data?
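On the first question: in fitdistr, extra arguments such as lower are passed on to optim as bounds on the parameters being estimated, not as a truncation point for the data, which would explain a shape estimate sitting exactly at 1. A minimal sketch of one alternative, assuming simulated data in place of the real datasets, is to maximize the truncated likelihood directly, dividing each density value by P(X >= 1). The names negll_trunc_weibull, x_sim and est are illustrative only, and the same pattern should carry over to dlnorm/plnorm and dexp/pexp.

```r
set.seed(1)

# Simulated stand-in for one dataset: Weibull draws, keeping only values >= 1.
x_sim <- rweibull(5000, shape = 1.5, scale = 10)
x_sim <- x_sim[x_sim >= 1]

# Negative log-likelihood of a Weibull left-truncated at `trunc`.
# Parameters are optimized on the log scale so they stay positive.
negll_trunc_weibull <- function(logpar, x, trunc = 1) {
  shape <- exp(logpar[1])
  scale <- exp(logpar[2])
  -sum(dweibull(x, shape, scale, log = TRUE) -
         pweibull(trunc, shape, scale, lower.tail = FALSE, log.p = TRUE))
}

# Start from shape = 1 and scale = sample mean (an arbitrary but sane guess).
fit <- optim(c(0, log(mean(x_sim))), negll_trunc_weibull,
             x = x_sim, trunc = 1)

# Back-transform to the natural scale.
est <- exp(fit$par)
names(est) <- c("shape", "scale")
print(est)
```

Working on the log scale is a design choice that avoids needing box constraints at all; standard errors would have to be recovered separately, for example from a Hessian and the delta method.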
Secondly, as my datasets are large (>1000 data points), Kolmogorov-Smirnov
goodness-of-fit tests routinely come out statistically significant for
every distribution, so they do not help me choose between them.
Therefore, I would like to plot the observed data together with the
theoretical best-fit distributions (Weibull, exponential and lognormal)
to visually assess which fits the observed data best. So far I have been
doing this as follows:
> fitdistr(x,"weibull")
   shape   scale
     a       b
> ## density of the observed data
> D1<-density(x)
> ## density of a random sample from the theoretical best-fit Weibull
> ## distribution with shape parameter = a, scale parameter = b
> D2<-density(rweibull(1500,shape=a,scale=b))
> plot(range(D1$x),range(D1$y,D2$y),type="n",xlab="x",ylab="Density")
> lines(D1,col="red")
> lines(D2,col="blue")
This successfully plots the two density curves on the same graph, but
the curves extend below the x=1 threshold, even for the observed data! I
have tried limiting the x-axis using xlim=c(1,150), but the origin of
the graph is still drawn at (0,0). I only get a different origin if I
restrict x more severely, e.g. xlim=c(50,150). Does anyone know how I
can change the origin of the graph to (1,0)?
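On the plotting question, a sketch under stated assumptions: by default R pads each axis range by 4% (par settings xaxs and yaxs equal to "r"), so setting both to "i" should make the axes meet exactly at the requested limits. Separately, density() accepts a from argument that restricts where the estimate is evaluated. The data, shape_hat and scale_hat below are hypothetical placeholders, not values from the datasets described above.

```r
set.seed(1)

# Simulated stand-in for one observed dataset, left-truncated at 1.
x_sim <- rweibull(5000, shape = 1.5, scale = 10)
x_sim <- x_sim[x_sim >= 1]

# Hypothetical fitted values; substitute the estimates from your own fit.
shape_hat <- 1.5
scale_hat <- 10

# Evaluate the kernel density estimate only from the truncation point upward.
d_obs <- density(x_sim, from = 1)

# Theoretical truncated density on a grid, rather than a density() of
# random draws, which adds simulation noise to the fitted curve.
x_grid <- seq(1, max(x_sim), length.out = 200)
d_fit <- dweibull(x_grid, shape_hat, scale_hat) /
  pweibull(1, shape_hat, scale_hat, lower.tail = FALSE)

# xaxs = "i" and yaxs = "i" turn off the default 4% padding of the axes.
plot(d_obs$x, d_obs$y, type = "l", col = "red",
     xlim = c(1, max(x_sim)), ylim = c(0, max(d_obs$y, d_fit)),
     xaxs = "i", yaxs = "i", xlab = "x", ylab = "Density")
lines(x_grid, d_fit, col = "blue")
```

Note that the kernel estimate still smooths across the boundary, so the red curve is likely to dip artificially close to x=1; a histogram with a break at 1 may be a fairer comparison there.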
Sorry for the long e-mail! Any help would be greatly appreciated.
Regards,
Lauren
