[R] Hierarchical Cluster Analysis with large dataset

Sarah Goslee sarah.goslee at gmail.com
Sun Nov 3 23:01:30 CET 2013


Hi,

I think your dataset is too large for a hierarchical clustering of it to
be interpretable, but in general you should check out the cluster package,
specifically clara(), which is intended for use with large datasets.
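
For scale: dist() on 80000 rows would have to store 80000 * 79999 / 2 =
3,199,960,000 pairwise distances, which at 8 bytes each is roughly 25.6 GB.
clara() avoids this by clustering around medoids on subsamples. Note that it
is a partitioning method rather than a hierarchical one, so you need to choose
the number of clusters k up front. A minimal sketch, assuming simulated
stand-in data; the matrix x, k = 2, and the samples/sampsize values are
illustrative choices only, not recommendations for the real data:

```r
library(cluster)

set.seed(1)
# Hypothetical stand-in data: 5000 rows x 20 columns, two shifted groups
x <- rbind(matrix(rnorm(2500 * 20, mean = 0), ncol = 20),
           matrix(rnorm(2500 * 20, mean = 3), ncol = 20))

# clara() draws `samples` subsamples of `sampsize` rows, runs PAM on each,
# and keeps the best set of medoids, so the full n x n distance matrix
# is never built
fit <- clara(x, k = 2, samples = 20, sampsize = 200)

table(fit$clustering)  # cluster sizes
fit$medoids            # one representative row per cluster
```

Larger values of samples and sampsize generally give more stable medoids at
the cost of run time.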

Sarah

On Sun, Nov 3, 2013 at 4:42 AM, Petar Milin
<petar.milin at uni-tuebingen.de> wrote:
> Hello!
> Can anyone give me advice on running Hierarchical Cluster Analysis on large
> datasets? For example, 80000x10000. Calculating distances on such a
> dataframe seems impossible even on a very powerful computer.
>
> Also, any other advice that would lead to a reduction of dimensionality,
> i.e., clustering/grouping variables, would be more than welcome.
>
> Many thanks,
> PM
>
-- 
Sarah Goslee
http://www.functionaldiversity.org


