[R] rpart help

ripley at stats.ox.ac.uk
Tue Jan 21 16:40:03 CET 2003


On Tue, 21 Jan 2003, Doug Kitch wrote:

> Hello.  I am not sure if you can help me or not but I have a dataset with
> N ~ 4000 with binary response and p ~ 0.08, regardless of how many or
> how few variables I offer I get the following message: 'Error in
> rpart(formula, method="class"): No splits could be created Dumped.' If I
> run tree with the same dataset (no missing data) in S I get results.  Is
> there a problem with large datasets in rpart?

If there were, it would not be relevant: 4000 is not close to 'large'.

I suspect you ought to be using losses (a loss matrix, supplied via the
parms argument of rpart) with such a skewed binary response, and am not
surprised that no single split is effective.

?rpart.control should help you.
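
For illustration, a minimal sketch of what fitting with losses might look
like. The simulated data, the 10:1 loss ratio and the cp value are all
assumptions made up for this example and are not taken from the original
dataset; parms = list(loss = ...) and rpart.control() are the documented
rpart interfaces.

```r
library(rpart)

set.seed(1)
n  <- 4000
x1 <- rnorm(n)
x2 <- rnorm(n)

# Hypothetical data: a rare "yes" class, weakly related to x1 only.
p   <- plogis(-2.8 + 1.0 * x1)
y   <- factor(rbinom(n, 1, p), levels = c(0, 1), labels = c("no", "yes"))
dat <- data.frame(y, x1, x2)

# Loss matrix: rows are the true class, columns the predicted class,
# in factor-level order ("no", "yes"). The diagonal must be zero.
# Here calling a true "yes" a "no" is assumed to cost 10 times as much
# as the reverse error; the ratio is an arbitrary choice.
loss <- matrix(c( 0, 1,
                 10, 0), nrow = 2, byrow = TRUE)

# A smaller cp than the default of 0.01 lets weaker splits be kept.
fit <- rpart(y ~ x1 + x2, data = dat, method = "class",
             parms   = list(loss = loss),
             control = rpart.control(cp = 0.001, minsplit = 20))

print(fit)
printcp(fit)
```

Whether any split survives still depends on the data; the loss ratio and
cp would need tuning for a real problem.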

> Also, do you happen to know the parameter options which
> will make rpart and tree act the same.  I am wondering if
> this is possible since I have no missing data.

It's not exactly possible, but look in MASS4 for some comparisons.
Given that tree in S does not do what it is documented to do, it would be 
hard to reproduce, but tree in R comes pretty close to tree in S's 
documented behaviour.
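
As a rough sketch of the kind of side-by-side comparison meant here, one
might choose settings that put the two functions on a similar footing:
the entropy-based splitting criterion in each, and matching minimum node
sizes. The particular values below are assumptions chosen for
illustration. The stopping rules (cp in rpart, mindev in tree) are
defined differently, so the fitted trees need not agree. The kyphosis
data shipped with rpart stand in for a dataset with no missing values,
and the tree fit is skipped if the tree package is not installed.

```r
library(rpart)

# rpart: information (entropy) splitting, no cross-validation, and no
# surrogate or competitor splits, since there are no missing values.
fit_rpart <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   method  = "class",
                   parms   = list(split = "information"),
                   control = rpart.control(minsplit = 10, minbucket = 5,
                                           cp = 0.01, xval = 0,
                                           maxsurrogate = 0,
                                           maxcompete = 0))
print(fit_rpart)

# tree: deviance splitting with the same minimum node sizes.
if (requireNamespace("tree", quietly = TRUE)) {
  fit_tree <- tree::tree(Kyphosis ~ Age + Number + Start, data = kyphosis,
                         split   = "deviance",
                         control = tree::tree.control(nobs = nrow(kyphosis),
                                                      mincut = 5,
                                                      minsize = 10,
                                                      mindev = 0.01))
  print(fit_tree)
}
```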

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



