[R] cluster

Weiwei Shi helprhelp at gmail.com
Mon Jul 25 23:45:12 CEST 2005

Dear listers:

I have a question about the clustering methods available in R. I am
trying to down-sample the majority class in a classification problem
on an imbalanced dataset. Since I don't want to lose information from
the original dataset, I don't want to use naive down-sampling; I think
using clustering on the majority class to select "representative"
samples might help. So my question is: which clustering method should
I try to get the best result? I think the key issue might be the
choice of "distance", considering that in the next step I would like
to use decision trees.
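
To make the idea concrete, here is a minimal sketch of the kind of
thing I have in mind. It is only an illustration under several
assumptions: the toy data, the column names x1 and x2, the choice of
k-means from the stats package, Euclidean distance on standardized
features, and one representative per cluster are all placeholders, not
a recommendation.

```r
# Hypothetical sketch: down-sample the majority class by clustering it
# and keeping the observation closest to each cluster centre.
set.seed(1)

# Toy imbalanced data, made up for illustration: 500 majority, 25 minority.
maj  <- data.frame(x1 = rnorm(500), x2 = rnorm(500), class = "maj")
mino <- data.frame(x1 = rnorm(25, mean = 2), x2 = rnorm(25, mean = 2),
                   class = "min")

# Assumed target: keep as many majority rows as there are minority rows.
k <- nrow(mino)

# Assumed distance: Euclidean on standardized features.
X  <- scale(as.matrix(maj[, c("x1", "x2")]))
km <- kmeans(X, centers = k, iter.max = 50, nstart = 10)

# For each cluster, pick the actual row nearest to the cluster centre.
rep_idx <- sapply(seq_len(k), function(j) {
  members <- which(km$cluster == j)
  diffs   <- sweep(X[members, , drop = FALSE], 2, km$centers[j, ])
  members[which.min(rowSums(diffs^2))]
})

# Representative majority rows plus all minority rows.
balanced <- rbind(maj[rep_idx, ], mino)
table(balanced$class)
```

The resulting data frame could then be passed to a tree learner; what
I am unsure about is whether this choice of method and distance is
sensible for that purpose.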

Please share your experience with clustering. (Implementations
available outside R are also welcome.)

Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III
