[R] Clustering Large Applications..sort of

Peter Langfelder peter.langfelder at gmail.com
Wed Aug 10 23:18:10 CEST 2011


On Wed, Aug 10, 2011 at 12:07 PM, Ken Hutchison <vicvoncastle at gmail.com> wrote:
> Hello all,
>   I am using the clustering functions in R in order to work with large
> masses of binary time series data, but the clustering functions do not
> seem able to handle a practical problem of this size. The function 'hclust'
> is good in that I do not want to make assumptions about the number of
> clusters present (though it may be subpar for a problem of this size, and
> thus doubly poor for this application), but given my computational
> resources and time, hclust is not functionally good enough;

How big is your problem? If your distance (dissimilarity) matrix fits
in the memory of your machine, the packages flashClust and fastcluster
provide much faster implementations of hierarchical clustering than the
stock R function hclust.
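
A rough sketch of what I mean, in case it helps. The toy data, its
dimensions, the average linkage and the helper dist_gib() are all made up
for illustration and are not from your problem. The code uses
fastcluster's hclust only if that package is installed and otherwise
falls back to the stock function, so treat it as a starting point rather
than a recipe:

```r
# Rough memory needed for a 'dist' object on n items (hypothetical helper):
# n * (n - 1) / 2 doubles at 8 bytes each, reported in GiB.
dist_gib <- function(n) n * (n - 1) / 2 * 8 / 2^30

set.seed(1)
# Toy stand-in for the real data: 200 binary series of length 50 (made up).
x <- matrix(rbinom(200 * 50, size = 1, prob = 0.3), nrow = 200)

# Binary (Jaccard-type) dissimilarity between rows, from base R.
d <- dist(x, method = "binary")

# Use fastcluster's hclust if the package is installed, else the stock one.
hc_fun <- if (requireNamespace("fastcluster", quietly = TRUE)) {
  fastcluster::hclust
} else {
  stats::hclust
}
hc <- hc_fun(d, method = "average")

# No number of clusters is assumed up front; the tree can be cut afterwards.
groups <- cutree(hc, k = 4)
```

With this estimate, dist_gib(50000) comes to roughly 9.3 GiB for the
dissimilarities alone, which is why the size of the problem matters.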

Peter


