[R] Finding overlaps in vector

Fri Dec 21 10:56:30 CET 2007


Dear all,

I'm trying to work out how to find clusters of values in a vector that
lie closer to each other than a given distance. An example:

vector <- c(0,0.45,1,2,3,3.25,3.33,3.75,4.1,5,6,6.45,7,7.1,8)

When using '0.5' as the proximity requirement, the following groups would
result:
0,0.45
3,3.25,3.33,3.75,4.1
6,6.45
7,7.1
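
A minimal sketch of the grouping described above, assuming that sorted
neighbours belong to the same group when their difference is strictly
smaller than the proximity value. This is not Jim Holtman's code; the
names find_clusters and tol are made up for illustration.

```r
# Hypothetical helper: chain sorted values whose gap to the previous
# value is smaller than `tol`, and return only groups of two or more.
find_clusters <- function(x, tol = 0.5) {
  x <- sort(x)
  # A new run starts wherever the gap to the previous value is >= tol.
  run_id <- cumsum(c(TRUE, diff(x) >= tol))
  runs <- split(x, run_id)
  # Singletons (e.g. 1, 2, 5, 8) are not reported as clusters.
  unname(runs[sapply(runs, length) > 1])
}

vector <- c(0,0.45,1,2,3,3.25,3.33,3.75,4.1,5,6,6.45,7,7.1,8)
clusters <- find_clusters(vector, 0.5)
print(clusters)
```

Because the gap between 6.45 and 7 is 0.55, the run index increases
there, so 6,6.45 and 7,7.1 end up in different list elements.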

Jim Holtman proposed a very elegant solution in
http://tolstoy.newcastle.edu.au/R/e2/help/07/07/21286.html, which I have
modified and used since he sent it to me. The beauty of this approach
is that it works not only for constant proximity requirements as above,
but also for overlap windows defined in terms of ppm around each value.
Now I have an additional need and have found no way (short of iteratively
stepping through all the groups returned) to do it with Jim's approach:
how can I determine that 6,6.45 and 7,7.1 are separate clusters?
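
To make the ppm case concrete, here is a rough sketch under stated
assumptions: every value x is positive and gets the window
[x - x*ppm/1e6, x + x*ppm/1e6], and two sorted neighbours are chained
when their windows overlap. It is not Jim's approach; the function
name cluster_ids_ppm and the example masses are invented. Each value
receives a cluster number, so separate clusters can be told apart
without stepping through the groups one by one.

```r
# Hypothetical helper: label positive values by overlap of ppm windows.
cluster_ids_ppm <- function(x, ppm) {
  x <- sort(x)
  half <- x * ppm / 1e6          # assumed half-width of each window
  lower <- x - half
  upper <- x + half
  n <- length(x)
  # TRUE where a window does not reach the window of the previous value.
  new_run <- c(TRUE, lower[-1] >= upper[-n])
  data.frame(value = x, cluster = cumsum(new_run))
}

# Made-up values, for illustration only.
masses <- c(500.000, 500.004, 500.020, 1000.000, 1000.015)
res <- cluster_ids_ppm(masses, ppm = 10)
print(res)
```

Values that share a number in the cluster column belong together;
table(res$cluster) then gives the size of each cluster.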

Thanks for any hints, Joh