[R] passing known medoids to clara() in the cluster package

Dylan Beaudette dylan.beaudette at gmail.com
Wed May 17 22:21:05 CEST 2006


Martin,

Just wanted to check on the status of support for passing known medoids to
the clara() function in the cluster package.

Cheers,

Dylan

On Monday 10 April 2006 14:25, Dylan Beaudette wrote:
> Thanks for the reply.
>
> On Sunday 09 April 2006 11:46 pm, Martin Maechler wrote:
> > >>>>> "DylanB" == Dylan Beaudette <dylan.beaudette at gmail.com>
> > >>>>>     on Sun, 9 Apr 2006 19:28:44 -0700 writes:
> >
> >     DylanB> Greetings, I have had good success using the clara()
> >     DylanB> function to perform a simple cluster analysis on a
> >     DylanB> large dataset (1 million+ records with 9 variables).
> >
> >     DylanB> Since the clara function is a wrapper to pam(),
> >     DylanB> which will accept known medoid data - I am wondering
> >     DylanB> if this too is possible with clara() ... The
> >     DylanB> documentation does not suggest that this is
> >     DylanB> possible.
> >
> > indeed, it doesn't --  because it's not yet possible.
> > I (as maintainer of "cluster") added the ``known medoid''
> > option to pam() last June (for cluster version 1.10.0),
> > and left a note in my TODO file to do the same for clara().
>
> Ah, that would explain things! :) I will check back periodically to see
> when this feature is completed.
>
> > Unfortunately it's not true that clara() is a wrapper to pam(),
> > as you state above.
>
> I must have misread the manual pages...
>
> > Given your wish and clear "use case" situation, I'm more
> > motivated to approach this particular 'TODO' item!
> >
> > Martin Maechler, ETH Zurich
> >
> >     DylanB> Essentially I am trying to implement a "supervised
> >     DylanB> classification" of numerous geographic data
> >     DylanB> layers. The "unsupervised" approach using clara()
> >     DylanB> works well, but I feel the output classes would be
> >     DylanB> more meaningful if I were able to let clara() know
> >     DylanB> about the classes that I have in mind.
> >
> >     DylanB> Is this at all feasible, or am I trying to
> >     DylanB> accomplish something that is not possible?
>
> Thanks Martin!
>
> I will give pam() a try, and see if it can handle the large dataset that I
> am currently using clara() for -- usually only about 5 seconds are required
> for clara() to complete.
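
For reference, a minimal sketch of what passing known medoids to pam() looks
like. The toy data and the row indices in `known` are made up for illustration
and are not from my dataset. The `medoids` argument takes row indices into the
data, and pam() treats them as initial medoids, so the final medoids may still
differ from the ones supplied.

```r
library(cluster)

set.seed(1)
# Toy data: three well-separated groups in 2-D (illustrative only)
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# Hypothetical "known" medoids: one representative row index per class
known <- c(1, 21, 41)

# pam() accepts the initial medoids as a length-k vector of row indices
fit <- pam(x, k = 3, medoids = known)

fit$id.med              # row indices of the final medoids
table(fit$clustering)   # number of observations per cluster
```

Note that pam() works from all pairwise dissimilarities, so memory use grows
quadratically with the number of rows; that is likely to be the limiting
factor on a dataset of the size mentioned above.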

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341



