[R] Problem getting R's decision tree for Quinlan's golf example data [Broadcast]

Liaw, Andy andy_liaw at merck.com
Mon Apr 17 02:56:00 CEST 2006


See ?rpart.control.  I get:
 
> golf.rp = rpart(Outlook ~ ., golf, control=rpart.control(minsplit=1))
> golf.rp
n= 14 
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 14 9 rain (0.2857143 0.3571429 0.3571429)  
   2) Temperature< 71.5 6 2 rain (0.1666667 0.6666667 0.1666667)  
     4) Temperature< 64.5 1 0 overcast (1.0000000 0.0000000 0.0000000) *
     5) Temperature>=64.5 5 1 rain (0.0000000 0.8000000 0.2000000)  
      10) Humidity>=75 3 0 rain (0.0000000 1.0000000 0.0000000) *
      11) Humidity< 75 2 1 rain (0.0000000 0.5000000 0.5000000)  
        22) Temperature< 67 1 0 rain (0.0000000 1.0000000 0.0000000) *
        23) Temperature>=67 1 0 sunny (0.0000000 0.0000000 1.0000000) *
   3) Temperature>=71.5 8 4 sunny (0.3750000 0.1250000 0.5000000)  
     6) PlayDontPlay=Play 5 2 overcast (0.6000000 0.2000000 0.2000000)  
      12) Humidity>=72.5 4 1 overcast (0.7500000 0.2500000 0.0000000)  
        24) Temperature>=78 2 0 overcast (1.0000000 0.0000000 0.0000000) *
        25) Temperature< 78 2 1 overcast (0.5000000 0.5000000 0.0000000)  
          50) Temperature< 73.5 1 0 overcast (1.0000000 0.0000000 0.0000000) *
          51) Temperature>=73.5 1 0 rain (0.0000000 1.0000000 0.0000000) *
      13) Humidity< 72.5 1 0 sunny (0.0000000 0.0000000 1.0000000) *
     7) PlayDontPlay=DontPlay 3 0 sunny (0.0000000 0.0000000 1.0000000) *
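
Note that the call above uses Outlook as the response, so the tree predicts the weather rather than the decision to play. A minimal sketch for the original question follows, assuming the 14 rows quoted below are rebuilt by hand as a data frame. The object names golf_fit_default and golf_fit_small, and the values minsplit = 2 and minbucket = 1, are illustrative assumptions and not taken from the thread.

```r
library(rpart)

# The 14 rows from the original post, rebuilt by hand (assumption: factors
# for the categorical columns, matching the summary() output in the question)
golf <- data.frame(
  Outlook = factor(c("sunny", "sunny", "overcast", "rain", "rain", "rain",
                     "overcast", "sunny", "sunny", "rain", "sunny",
                     "overcast", "overcast", "rain")),
  Temperature = c(85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71),
  Humidity = c(85, 90, 78, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 80),
  Windy = factor(c("false", "true", "false", "false", "false", "true",
                   "true", "false", "false", "false", "true", "true",
                   "false", "true")),
  PlayDontPlay = factor(c("DontPlay", "DontPlay", "Play", "Play", "Play",
                          "DontPlay", "Play", "DontPlay", "Play", "Play",
                          "Play", "Play", "Play", "DontPlay"))
)

# Default control: minsplit = 20 exceeds the 14 rows, so no split is attempted
golf_fit_default <- rpart(PlayDontPlay ~ Outlook + Temperature + Humidity + Windy,
                          method = "class", data = golf)

# Relaxed control (illustrative values) so that small nodes may be split
golf_fit_small <- rpart(PlayDontPlay ~ Outlook + Temperature + Humidity + Windy,
                        method = "class", data = golf,
                        control = rpart.control(minsplit = 2, minbucket = 1))

print(golf_fit_small)

# Number of nodes in each fit: a single row in $frame means "just a root"
nrow(golf_fit_default$frame)
nrow(golf_fit_small$frame)
```

With only 14 observations such a tree is heavily overfit, and because rpart makes binary splits it may not match the tree in Quinlan's book.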

Andy

  _____  

From: r-help-bounces at stat.math.ethz.ch on behalf of Alan Lapedes
Sent: Sun 4/16/2006 5:14 PM
To: r-help at stat.math.ethz.ch
Subject: [R] Problem getting R's decision tree for Quinlan's golf example data [Broadcast]



Newbie question, but I've checked the archives etc. I am trying to reproduce
Quinlan's trivial "golf" decision tree example in R. The data file
of 14 examples follows (read in via read.table()):

Outlook Temperature Humidity Windy PlayDontPlay 
1 sunny 85 85 false DontPlay 
2 sunny 80 90 true DontPlay 
3 overcast 83 78 false Play 
4 rain 70 96 false Play 
5 rain 68 80 false Play 
6 rain 65 70 true DontPlay 
7 overcast 64 65 true Play 
8 sunny 72 95 false DontPlay 
9 sunny 69 70 false Play 
10 rain 75 80 false Play 
11 sunny 75 70 true Play 
12 overcast 72 90 true Play 
13 overcast 81 75 false Play 
14 rain 71 80 true DontPlay 

R reports no format or other trivial problems: 
> summary(golf) 
     Outlook   Temperature      Humidity      Windy     PlayDontPlay 
 overcast:4   Min.   :64.0   Min.   :65.0   false:8   DontPlay:5    
 rain    :5   1st Qu.:69.2   1st Qu.:71.2   true :6   Play    :9    
 sunny   :5   Median :72.0   Median :80.0                           
              Mean   :73.6   Mean   :80.3                           
              3rd Qu.:78.8   3rd Qu.:88.8                           
              Max.   :85.0   Max.   :96.0             

I then try to build a decision tree: 
> golf.rpart <- rpart(PlayDontPlay ~ Outlook + Temperature + Humidity +
Windy, method="class", data=golf) 

which doesn't yield a tree: 
> golf.rpart 
n= 14 

node), split, n, loss, yval, (yprob) 
      * denotes terminal node 

1) root 14 5 Play (0.35714 0.64286) * 

> plot(golf.rpart) 
Error in plot.rpart(golf.rpart) : fit is not a tree, just a root 
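
One plausible explanation for the root-only fit, consistent with the ?rpart.control pointer in the reply at the top: rpart only attempts to split a node that holds at least minsplit observations, and the default is 20, which is more than the 14 rows here. A quick way to inspect the defaults is sketched below (the names defaults and n_golf are just illustrative):

```r
library(rpart)

# Tuning parameters used when no control argument is supplied
defaults <- rpart.control()

defaults$minsplit   # minimum node size for a split to be attempted
defaults$minbucket  # minimum number of observations in a terminal node
defaults$cp         # complexity parameter threshold

# The golf data has 14 rows, fewer than the default minsplit
n_golf <- 14
n_golf < defaults$minsplit
```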

Thanks, 
Alan 

______________________________________________ 
R-help at stat.math.ethz.ch mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



