[R] rpart problem

Don Stierman dstierman at cableone.net
Sun Apr 28 08:41:00 CEST 2002


I am having problems with rpart and a particular dataset. When I run the
following script, the fitted tree contains only the root node, with no
splits. However, if I delete the value of 79 in the last row and column of
the dataset, I do get a tree with several splits. Is rpart really that
dependent on a single value, or is there something else wrong? How do I fix
this so that I get splits without having to change the dataset? Also, what
is a good way to determine whether a single value or a single column is
causing problems in rpart?
Thanks, Don

R script:

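library(rpart)
# Read the attached dataset (temp.txt)
data <- read.csv("C:\\temp.txt")
# Regression tree: column I is the response, all other columns are predictors.
# Naming the response as I (not data$I) keeps "." from also picking up I.
fit <- rpart(I ~ ., data = data, method = "anova")
fit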
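
One possible way to check whether a single value is dominating, sketched here on synthetic data because temp.txt is not reproduced in this message. The data frame `demo`, its column ranges, and the planted value of 79 are assumptions made purely for illustration; only the column names I, A, C and D are taken from the post. The idea is to rank observations by their share of the response's total sum of squares and to count flagged values per column with `boxplot.stats`.

```r
# Hypothetical stand-in for temp.txt (NOT the real data)
set.seed(1)
demo <- data.frame(A = runif(200, 20, 35),
                   C = runif(200, 0, 1),
                   D = runif(200, 10, 13))
demo$I <- 12 + 2 * (demo$C < 0.3) + rnorm(200)
demo$I[200] <- 79   # one extreme response value, planted on purpose

# Share of the total sum of squares of the response owed to each row
contrib <- (demo$I - mean(demo$I))^2
share   <- contrib / sum(contrib)
worst   <- head(order(share, decreasing = TRUE))
data.frame(row = worst, I = demo$I[worst], share = round(share[worst], 3))

# Response values flagged by the boxplot rule
boxplot.stats(demo$I)$out

# Number of flagged values in each column, as a quick per-column screen
sapply(demo, function(col) length(boxplot.stats(col)$out))
```

A row whose share is far larger than the rest is a candidate for closer inspection; this is a screening aid only and does not by itself show that the row is wrong.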

Results with value of 79:

n= 703

node), split, n, deviance, yval
      * denotes terminal node

1) root 703 30738.37 12.02987 *
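
A root-only tree like the one above can occur when no split clears the complexity threshold. rpart's default is cp = 0.01, and for method="anova" the documentation describes this as requiring the overall R-squared to increase by cp at each step, so the required absolute improvement grows with the root deviance. Whether that is what happens with this dataset is an assumption, not something the output confirms. A minimal sketch of how one might test it, using synthetic data (the data frame `demo` is hypothetical) and a lower cp passed through `rpart.control`:

```r
library(rpart)

# Hypothetical stand-in for temp.txt (NOT the real data)
set.seed(1)
demo <- data.frame(A = runif(200, 20, 35),
                   C = runif(200, 0, 1),
                   D = runif(200, 10, 13))
demo$I <- 12 + 2 * (demo$C < 0.3) + rnorm(200)
demo$I[200] <- 79   # one extreme response value, planted on purpose

# Default fit (cp = 0.01)
fit_default <- rpart(I ~ ., data = demo, method = "anova")

# Same model with a much lower complexity threshold
fit_low_cp <- rpart(I ~ ., data = demo, method = "anova",
                    control = rpart.control(cp = 0.001))

# Compare the two trees and inspect the complexity table
fit_default
fit_low_cp
printcp(fit_low_cp)
```

If splits appear only at the lower cp, the threshold rather than the data format is the likely reason for the empty tree; the cross-validated error in the `printcp` table helps judge whether those extra splits are worth keeping.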

Results without value of 79:

n=702 (1 observations deleted due to missing)

node), split, n, deviance, yval
      * denotes terminal node

 1) root 702 26246.990 11.93447
   2) C>=0.325 615 19389.630 11.71382
     4) D< 11.635 522 12578.270 11.47893 *
     5) D>=11.635 93  6620.903 13.03226
      10) D>=11.655 82  1849.561 12.07317 *
      11) D< 11.655 11  4133.636 20.18182 *
   3) C< 0.325 87  6615.747 13.49425
     6) A< 28.68 69  1872.638 12.59420 *
     7) A>=28.68 18  4472.944 16.94444 *
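
The two root nodes printed above can be reconciled by arithmetic alone. For method="anova" the deviance is the sum of squared differences from the node mean, so adding one observation y to n others raises it by n/(n+1) * (y - mean)^2. The sketch below uses only numbers copied from the two outputs; the variable names are my own.

```r
# Values copied from the root node printed without the 79
n_without    <- 702
dev_without  <- 26246.990
mean_without <- 11.93447
y_extreme    <- 79

# Add the single observation back in
n_with    <- n_without + 1
mean_with <- (n_without * mean_without + y_extreme) / n_with
dev_with  <- dev_without +
  (n_without / n_with) * (y_extreme - mean_without)^2

# Compare with the root node printed with the 79: yval 12.02987, deviance 30738.37
c(mean_with = mean_with, dev_with = dev_with)

# Fraction of the root deviance owed to that one observation
(dev_with - dev_without) / dev_with
```

This reproduces the root mean and deviance of the first output to within rounding, and puts the share of the root deviance owed to the single value at about 14.6%. So the 79 is a value of the response I, and it noticeably inflates the total against which any split's improvement is measured.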
Attached dataset:
Name: temp.txt
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20020428/b24c0219/temp.txt

