[R] Rpart, custom penalty for an error

Maciej Bliziński m.blizinski at wit.edu.pl
Sun Sep 10 19:09:42 CEST 2006


Hello all R-help list subscribers,

I'd like to build a regression tree on a data set with a binary response
variable. Only 5% of the observations are successes, so the tree will
hardly find any combination of variable values that yields a probability
of success above 50%. I am, however, interested in regions where the
probability of success is noticeably higher than 5%, for example 20%.
I've tried rpart with the weights option, increasing the weights of the
success observations.
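
To make the setup concrete, here is a minimal sketch of the kind of call
I mean. The data are simulated, and the names x1, x2 and success, the
weight of 10, and the choice of method = "class" are all assumptions
made for illustration, not my real data or settings.

```r
library(rpart)

set.seed(1)
n <- 2000

# Hypothetical predictors; the names are made up for this sketch
d <- data.frame(x1 = runif(n), x2 = runif(n))

# Success is rare overall (about 2%) but more likely (about 25%)
# in one corner of the predictor space
p <- ifelse(d$x1 > 0.8 & d$x2 > 0.5, 0.25, 0.02)
d$success <- rbinom(n, 1, p)

# Give every success 10 times the weight of a failure
# (the factor 10 is an arbitrary choice)
w <- ifelse(d$success == 1, 10, 1)

# Assumption: a classification tree, since the post talks about
# counts of successes in the leaves
fit <- rpart(factor(success) ~ x1 + x2, data = d,
             weights = w, method = "class")

print(fit)
```

Without the weights argument, a tree on data like this may well stay at
the root node, because no region reaches a majority of successes.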

It works as expected in terms of tree creation: instead of a single
root node, a full tree is built. But the output of plot() and text() is
somewhat misleading. I'm interested in the observation counts inside
each leaf, so I use the "use.n = TRUE" parameter. The counts displayed
are not the original numbers of successes from the sample; the success
observations appear to have been cloned according to their weights.
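
The mismatch can be reproduced with a small sketch, again on simulated
data with made-up names (x1, x2, success) and an arbitrary weight of 10.
It compares the per-leaf class counts stored in the fitted object with a
plain tabulation of the original observations by leaf. I am assuming
here that columns 2 and 3 of fit$frame$yval2 hold the per-class counts
for a two-class tree, which is how I read the rpart object layout.

```r
library(rpart)

set.seed(1)
n <- 2000
d <- data.frame(x1 = runif(n), x2 = runif(n))   # hypothetical predictors
p <- ifelse(d$x1 > 0.8 & d$x2 > 0.5, 0.25, 0.02)
d$success <- rbinom(n, 1, p)
w <- ifelse(d$success == 1, 10, 1)              # arbitrary weight of 10

fit <- rpart(factor(success) ~ x1 + x2, data = d,
             weights = w, method = "class")

# fit$where gives, for every observation, the row of fit$frame
# that corresponds to the leaf it ends up in
leaves <- sort(unique(fit$where))

# Class counts as stored in the tree (assumed to be columns 2:3 of
# yval2 for two classes); these are sums of weights
weighted <- fit$frame$yval2[leaves, 2:3, drop = FALSE]

# Original, unweighted counts of failures and successes per leaf
original <- table(leaf = factor(fit$where, levels = leaves),
                  success = d$success)

print(weighted)
print(original)
```

In this sketch the failure column agrees between the two tables, while
the stored success counts are the original ones multiplied by the
weight, which is the kind of number text(fit, use.n = TRUE) shows me.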

I'd like to split the tree just as the weights parameter allows me to,
while keeping the original numbers of observations in the tree plot. Is
this possible? If so, how?

Kind regards,
Maciej

-- 
Maciej Bliziński <m.blizinski at wit.edu.pl>
http://automatthias.wordpress.com


