[R] random forest question

Liaw, Andy andy_liaw at merck.com
Tue Jan 20 14:12:28 CET 2004


The classwt values are used in the Gini index when splitting nodes.  What we
found (about two years ago) is that this option does not affect the
predictions as much as one would expect.  I suspect this is because
the trees are grown to maximum size and not pruned back.  This is why I
implemented the cutoff and sampsize options in randomForest().  Do make use
of them.  The classwt option is there just for old times' sake, I guess...
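
A minimal sketch of how cutoff and sampsize might be used.  The data below are
made up for illustration (three classes, the third one rare) and are not the
data from the question; the object names rf0, rf1 and rf2 are likewise
hypothetical, and the sketch assumes a reasonably recent randomForest is
installed.

```r
library(randomForest)  # assumes the randomForest package is installed

set.seed(1)

# Hypothetical imbalanced data: three classes, class 3 is rare.
n <- c(500, 100, 25)
y <- factor(rep(1:3, times = n))
x <- data.frame(v1 = rnorm(sum(n), mean = as.integer(y)),
                v2 = rnorm(sum(n)))

# Baseline forest with default settings.
rf0 <- randomForest(x, y)

# sampsize: draw the same number of cases from each class for every tree
# (stratified by the class label), so the rare class is not swamped.
rf1 <- randomForest(x, y, strata = y, sampsize = c(25, 25, 25))

# cutoff: the winning class is the one with the largest ratio of vote
# proportion to cutoff, so raising the cutoff for class 2 makes it harder
# for class 2 to win.
rf2 <- randomForest(x, y, cutoff = c(0.25, 0.50, 0.25))

# Compare the out-of-bag confusion matrices.
print(rf0$confusion)
print(rf1$confusion)
print(rf2$confusion)
```

How much either option helps will depend on the data, so it is worth comparing
the confusion matrices rather than assuming a particular outcome.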

BTW, 4.0-7 is current, and fixes a few bugs in 4.0-1.

BTW #2:  The convention is to direct questions specific to a package to the
package maintainer (me in this case) first, before posting to R-help.

HTH,
Andy

> From: Christian Hennig
> 
> Hi,
> 
> here are three results of random forest (version 4.0-1).
> The results seem to be more or less the same which is strange because I
> changed the classwt.
> I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer
> cases classified as class 2. Did I understand something wrong?
> 
> Christian
> 
> x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
>                      y=as.factor(traingroups),
>                      xtest=as.data.frame(mfilters[cvtest,]),
>                      ytest=as.factor(testgroups))
> > x1rf$test$confusion
>      1    2  3 class.error
> 1 9954   30 19  0.00489853
> 2  139 1854  0  0.06974410
> 3  420    0 84  0.83333333
> x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
>                      y=as.factor(traingroups),
>                      xtest=as.data.frame(mfilters[cvtest,]),
>                      ytest=as.factor(testgroups),
>                      classwt=c(0.45,0.1,0.45))
> > x1rf$test$confusion
>      1    2  3 class.error
> 1 9952   31 20  0.00509847
> 2  164 1828  1  0.08278976
> 3  440    0 64  0.87301587
> x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]),
>                      y=as.factor(traingroups),
>                      xtest=as.data.frame(mfilters[cvtest,]),
>                      ytest=as.factor(testgroups),
>                      classwt=c(0.49,0.02,0.49))
> > x1rf$test$confusion
>      1    2  3 class.error
> 1 9948   35 20  0.00549835
> 2  170 1823  0  0.08529854
> 3  439    0 65  0.87103175
> 
> 
> 
> ***********************************************************************
> Christian Hennig
> Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
> hennig at math.uni-hamburg.de, 
> http://www.math.uni-hamburg.de/home/hennig/
> 
> #######################################################################
> I recommend www.boag-online.de

