[R] Subgroup discovery in R

Hans Werner Borchers hwborchers at googlemail.com
Thu Jan 25 10:55:55 CET 2007


I would very much like to apply "subgroup discovery" techniques to some
of the data I am analyzing at the moment.

Subgroup discovery is an interesting approach and is quite well known in
the Data Mining community, though in essence it is a purely statistical
approach.

For an introductory article, see "Subgroup discovery and visualization
methods" <soleunet.ijs.si/website/other/final_report/html/WP5-s14.html>.
To my knowledge it was originally developed in the 1990s by W. Klösgen
and Stefan Wrobel.

I can only recommend this technique: in the past, when I had access to
commercial versions of it (MIDOS in the KEPLER tool), it gave me a number
of successes and surprising insights into data.

Please don't confuse it with "subgroup analysis", which is often mentioned
in clinical studies but is not meant as a discovery technique.

My R site searches have not turned up any hits. I have also been following
the R-help list for some time now and cannot remember seeing any mention
of it.

If there is no program in R or piece of code that could be reused, then
I would start writing my own simple version, though it's not quite trivial
to implement and may be slow on large data sets.
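
As a rough idea of what such a simple version could look like, here is a
minimal sketch in base R. It is only an illustration under several
assumptions: all attributes are categorical, only single "attribute = value"
conditions are searched (no conjunctions), and subgroups are scored by
weighted relative accuracy (WRAcc), a common quality measure that is not
necessarily the one used in MIDOS. The names wracc and find_subgroups and
the toy data are made up for this example.

```r
# WRAcc = (n / N) * (p - p0): coverage times the gain in target share.
wracc <- function(in_subgroup, is_target) {
  n <- sum(in_subgroup)
  if (n == 0) return(0)
  p0 <- mean(is_target)               # target share in the whole data set
  p  <- mean(is_target[in_subgroup])  # target share inside the subgroup
  (n / length(is_target)) * (p - p0)
}

# Exhaustive search over all single "attribute = value" conditions.
find_subgroups <- function(data, target, target_value, top = 5) {
  is_target <- data[[target]] == target_value
  rows <- list()
  for (a in setdiff(names(data), target)) {
    values <- as.character(data[[a]])
    for (v in unique(values)) {
      cover <- values == v
      rows[[length(rows) + 1]] <- data.frame(
        condition = paste(a, "=", v),
        size      = sum(cover),
        quality   = wracc(cover, is_target),
        stringsAsFactors = FALSE
      )
    }
  }
  result <- do.call(rbind, rows)
  # Best-scoring subgroups first
  head(result[order(-result$quality), ], top)
}

# Toy data, invented purely for illustration
toy <- data.frame(
  smoker  = c("yes", "yes", "yes", "no", "no", "no", "no", "yes"),
  sex     = c("m", "f", "m", "f", "m", "f", "m", "f"),
  disease = c("yes", "yes", "yes", "no", "no", "no", "no", "yes"),
  stringsAsFactors = FALSE
)
res <- find_subgroups(toy, target = "disease", target_value = "yes")
print(res)
```

Extending the search to conjunctions of conditions makes the number of
candidates grow quickly, which is where pruning or a beam search would
become necessary and where the speed concern on large data sets comes in.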

Many thanks, Hans Werner Borchers


