[R] Modify rpart

Sutton-Charani nicolas.sutton-charani at hds.utc.fr
Mon Apr 18 14:51:19 CEST 2011


Hello everyone,

I am a PhD student in Statistics. For my work I had to modify the rpart code
and use it to build some decision trees.
I thought I had managed, but I noticed some strange results in the trees I got
from the modified rpart.
I'd like to ask whether I made the right modification:

In rpart, it is the Gini measure that I would like to modify:

As far as I know, the Gini measure is of the form gini(t) = 1 -
sum(i=1:n)[Fi(t)^2] with Fi(t) = Ni(t)/N(t) = p(t)

I wanted to replace this measure with m(t) =
1 - 0.5*sum(i=1:n)[Fi(t)*log2(Fi(t)+1)]

When I looked into the rpart package source, in src/gini.c, I found

static double gini_impure1(p) double p; {  return(1 - p*p); }

which I replaced with

static double gini_impure1(p) double p; {  return(1-0.5*p*log2(p+1)); }

Am I right?

Thank you 

Nicolas




