[R] data mining for R

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Thu Sep 5 14:36:51 CEST 2002


Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:

> Well, R does not have a `statistics' plug in either!
> 
> In the words of Witten & Frank's book, Data Mining is `statistics plus
> marketing', and R can do a lot of data mining.
> 
> If you could be more specific about what techniques you want to use, we
> may be able to help you further.
> 
> On Thu, 5 Sep 2002 Pgoodr1 at aol.com wrote:
> 
> > I was wondering if R had a data mining component and how I could get it. If not, do you know of anyone who is developing a data mining "plug in" for R?
> > Phillip Goodreid

Another possible definition is "statistics with massive amounts of
incidental data". A large part of DM practice seems to be
"quarrying". The actual statistical methodology is only one part of a
complicated process: data are pulled out of databases on, say, a weekly
schedule, roughly preprocessed, fed to a statistics engine, and
postprocessed into something that can end up on the manager's desk.

My impression is that this is essentially what SPSS's Clementine
product does, using a GUI to draw arrows between pretty little
hexagonal cells. It is not at all unthinkable that something like that
could be coded up in R too. I think we have most of the pieces to do it.
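
As a rough illustration of the extract / preprocess / fit / report chain
described above, here is a minimal sketch in base R. Everything in it is
an assumption made for illustration: the function names are hypothetical,
a simulated data frame stands in for the weekly database extract, and a
plain linear model stands in for whatever the "statistics engine" would
really be.

```r
# Hypothetical weekly pipeline: extract -> preprocess -> model -> report.
# All names are illustrative; simulated data replaces the database query.

extract_data <- function(n = 200) {
  set.seed(1)                       # reproducible stand-in for a DB extract
  spend <- runif(n, 0, 100)
  data.frame(
    spend  = spend,
    visits = rpois(n, 5),
    sales  = 10 + 2 * spend + rnorm(n, sd = 5)
  )
}

preprocess <- function(d) {
  d <- d[complete.cases(d), ]       # drop rows with missing values
  d[d$spend >= 0, ]                 # drop obviously invalid records
}

fit_model <- function(d) {
  lm(sales ~ spend + visits, data = d)
}

summarise_fit <- function(fit) {
  # Boil the fit down to a small table suitable for a report
  co <- summary(fit)$coefficients
  data.frame(
    term     = rownames(co),
    estimate = round(co[, "Estimate"], 2),
    p_value  = signif(co[, "Pr(>|t|)"], 2),
    row.names = NULL
  )
}

run_pipeline <- function() {
  summarise_fit(fit_model(preprocess(extract_data())))
}

report <- run_pipeline()
print(report)
```

Keeping each stage as a separate function means a real database query or
a different model could be swapped in without touching the other steps,
which is roughly what the arrows between cells amount to.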

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907