[R] Fitting inter-arrival time data

M. Edward Borasky znmeb at aracnet.com
Mon Jun 30 01:23:15 CEST 2003


I have a collection of data which includes inter-arrival times of requests
to a server. So far I have used "sm.density" to explore the distribution,
and it shows two large peaks. However, the estimate is built from Gaussian
kernels, and that's not really correct, because an inter-arrival time can
never be less than zero. In fact, the leftmost peak is centered somewhere
around ten seconds, and quite a bit of it extends into negative territory.
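
To illustrate the problem, here is a minimal sketch. It uses simulated data
and base R's density() rather than sm.density; both are assumptions, and my
real data and bandwidth differ. It estimates how much probability mass a
Gaussian kernel estimate places below zero:

```r
# Simulated stand-in for the real inter-arrival times (assumption: the
# real data are not reproduced here). Two exponential components with
# means of 10 and 300 seconds.
set.seed(1)
x <- c(rexp(500, rate = 1 / 10), rexp(500, rate = 1 / 300))

# Gaussian kernel density estimate on a grid that starts below zero.
d <- density(x, kernel = "gaussian", from = -100, to = max(x), n = 2048)

# Approximate probability mass assigned to negative times
# (Riemann sum over the evenly spaced grid).
step <- d$x[2] - d$x[1]
neg_mass <- sum(d$y[d$x < 0]) * step
neg_mass
```

Any value of neg_mass above zero is mass that a valid inter-arrival
distribution cannot have.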

What I'd like to do is fit this dataset to a mixture (sum) of exponentials,
hyper-exponentials and hypo-exponentials. My preference is to use the
well-known branching Erlang approximation (exponential stages) to the hyper-
and hypo-exponentials. In this approximation, a distribution is specified by
its mean and coefficient of variation.
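
For concreteness, here is a sketch of one common way to match a mean and
coefficient of variation (cv) with exponential stages. The function names
fit_stages and stage_moments are hypothetical, and the formulas (a
two-branch hyper-exponential with balanced means for cv > 1, a mixture of
Erlang(k-1) and Erlang(k) for cv <= 1) are one textbook two-moment match,
not necessarily the exact branching Erlang construction I have in mind:

```r
# Hypothetical helper: match mean m and coefficient of variation cv
# with a distribution built from exponential stages.
fit_stages <- function(m, cv) {
  c2 <- cv^2
  if (c2 > 1) {
    # Hyper-exponential: two parallel branches with balanced means,
    # i.e. prob[i] / rate[i] = m / 2 for both branches.
    p1 <- 0.5 * (1 + sqrt((c2 - 1) / (c2 + 1)))
    p <- c(p1, 1 - p1)
    list(type = "hyper", prob = p, rate = 2 * p / m)
  } else {
    # Hypo-exponential: with probability p use k - 1 stages, otherwise
    # k stages, all with the same rate; k satisfies
    # 1/k <= cv^2 <= 1/(k - 1).
    k <- max(2, ceiling(1 / c2))
    disc <- max(0, k * (1 + c2) - k^2 * c2)  # guard against rounding
    p <- (k * c2 - sqrt(disc)) / (1 + c2)
    list(type = "hypo", prob = c(p, 1 - p), stages = c(k - 1, k),
         rate = (k - p) / m)
  }
}

# Recompute mean and cv from the fitted parameters as a sanity check.
stage_moments <- function(f) {
  if (f$type == "hyper") {
    m1 <- sum(f$prob / f$rate)
    m2 <- sum(2 * f$prob / f$rate^2)
  } else {
    m1 <- sum(f$prob * f$stages) / f$rate
    m2 <- sum(f$prob * f$stages * (f$stages + 1)) / f$rate^2
  }
  c(mean = m1, cv = sqrt(m2 - m1^2) / m1)
}

stage_moments(fit_stages(10, 2))      # hyper-exponential case
stage_moments(fit_stages(300, 0.6))   # hypo-exponential case
```

Each call to stage_moments should reproduce the mean and cv passed to
fit_stages, up to rounding.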

So far, a literature search has turned up only the Expectation-Maximization
(EM) algorithm, and I haven't been able to locate R code for it. So my
questions are:

1. Is EM the "right way" to go about this, or is there something better?
2. Is there some EM code in R that I could experiment with, or do I need to
write my own?
3. Is there a way this could be done using the existing R kernel density
estimators and some kind of kernel that is zero for negative values of its
argument? 

-- 
M. Edward (Ed) Borasky
mailto:znmeb at borasky-research.net
http://www.borasky-research.net
 
"Suppose that tonight, while you sleep, a miracle happens - you wake up
tomorrow with what you have longed for! How will you discover that a miracle
happened? How will your loved ones? What will be different? What will you
notice? What do you need to explode into tomorrow with grace, power, love,
passion and confidence?" -- L. Michael Hall, PhD




More information about the R-help mailing list