[R] Identifying clusters of size n

Dylan Beaudette dylan.beaudette at gmail.com
Mon Jun 15 04:36:40 CEST 2009


On Sun, Jun 14, 2009 at 7:26 PM, Nathan S.
Watson-Haigh<nathan.watson-haigh at csiro.au> wrote:
>
> Dylan Beaudette wrote:
>> On Sun, Jun 14, 2009 at 4:39 PM, Nathan S.
>> Watson-Haigh<nathan.watson-haigh at csiro.au> wrote:
>>>
>>> Is there a library which is capable of identifying distinct clusters of size n
>>> from a series of XY coordinates?
>>>
>>> Failing this, I'd like to be able to do something like:
>>> Using a sliding window of size n along the x-axis I'd like to determine the
>>> distance between the center of the points in the window and the closest point
>>> outside the window. I could then use a distance cutoff to help define my
>>> clusters of size n. However, how can I calculate this distance?
>>>
>>> Cheers,
>>> Nathan
>>>
>>
>> Here is a start, using PAM clustering:
>>
>> http://casoilresource.lawr.ucdavis.edu/drupal/node/340
>>
>> cheers,
>> Dylan
>

Hi,

>
> Thanks, that looks interesting. However I need a clustering algorithm which has
> the following properties:
>
> 1) The ability to define clusters of size n
> 2) No need to specify a priori how many clusters there will be
> 3) The ability to omit data from any cluster. I don't think this package can do
> this.

Time to do some reading on the various clustering algorithms, their
assumptions, and their overall behaviour. Although I am not an expert,
many of the constraints you are trying to impose on the clustering
will require some kind of programming / decision on your end. It may
help to re-formulate the problem into some kind of raster-operation,
in which case GRASS GIS might be of interest to you.

> I suspect for something like this I'll have to define, a priori, how tight
> points within a cluster should be using some measure.
>

Hmm... In this case you may need a model-based or density-based
approach. The mclust and spatstat packages may be worth a look.
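
As a rough sketch of the sliding-window idea from the original question
(written in Python rather than R, and not taken from any package), the
distance from the centre of a window to the closest point outside it can
be computed directly. The names window_clusters, cutoff and max_radius
are hypothetical, and accepting a window only when it is both isolated
and tight is an assumption about what "cluster" should mean here:

```python
import math

def window_clusters(points, n, cutoff, max_radius=math.inf):
    """Slide a window of n points along x and keep the isolated, tight ones.

    A window is accepted when the nearest point outside it is at least
    `cutoff` from the window centre and every point inside it is within
    `max_radius` of that centre. Both thresholds are illustrative.
    """
    pts = sorted(points)                      # order by x (ties broken by y)
    clusters = []
    i = 0
    while i + n <= len(pts):
        window = pts[i:i + n]
        cx = sum(x for x, _ in window) / n    # centre of the window
        cy = sum(y for _, y in window) / n
        outside = pts[:i] + pts[i + n:]
        # Distance from the centre to the closest point outside the window
        gap = min((math.hypot(x - cx, y - cy) for x, y in outside),
                  default=math.inf)
        # Largest distance from the centre to a point inside the window
        spread = max(math.hypot(x - cx, y - cy) for x, y in window)
        if gap >= cutoff and spread <= max_radius:
            clusters.append(window)
            i += n                            # jump past the accepted cluster
        else:
            i += 1                            # slide the window by one point
    return clusters

pts = [(0, 0), (0.5, 0.2), (0.2, 0.6), (5, 5),
       (10, 0), (10.4, 0.3), (10.1, 0.5)]
clusters = window_clusters(pts, n=3, cutoff=3.0, max_radius=1.0)
```

Points that never fall in an accepted window are simply left out, and
the number of clusters is not fixed in advance. Because the windows
follow x order only, groups that are separated mainly along y may be
missed.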

Cheers,

Dylan

> Any thoughts?
> Nathan
>
> --
> --------------------------------------------------------
> Dr. Nathan S. Watson-Haigh
> OCE Post Doctoral Fellow
> CSIRO Livestock Industries
> Queensland Bioscience Precinct
> St Lucia, QLD 4067
> Australia
>
> Tel: +61 (0)7 3214 2922
> Fax: +61 (0)7 3214 2900
> Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html
> --------------------------------------------------------
>



