[R] comparing random forests and classification trees

Darin A. England england at cs.umn.edu
Tue Jan 30 20:32:07 CET 2007


I have also had this issue with randomForest: you lose the
ability to explain the classifier in a simple way to
non-specialists (everyone can understand a single decision tree).
As far as comparing the accuracy of the two, I think that you are
correct to compare them by the actual vs. predicted tables.
randomForest reports this as the confusion matrix, and it also
reports the out-of-bag error, which I think is what you are
referring to. I would not compare the rf out-of-bag error with the
rpart relative error (or the cross-validated error, if you are doing
cross-validation): rpart scales both of those by the root node
error, whereas the out-of-bag error is an absolute misclassification
rate.
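
As a rough sketch of that comparison (not your actual analysis), the
code below builds the actual vs. predicted table for both models on
the same held-out rows. The iris data, the 100/50 split, the seed and
the helper misclass() are all assumptions made for illustration. One
thing to keep in mind: predict() on an rpart fit without newdata
returns predictions for the training rows, while predict() on a
randomForest fit without newdata returns out-of-bag predictions, so
scoring both on held-out rows keeps the two tables comparable.

```r
# Sketch only: iris stands in for the real data (an assumption).
library(rpart)
library(randomForest)

set.seed(1)
test_idx <- sample(nrow(iris), 50)   # hypothetical 100/50 split
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

tree_fit <- rpart(Species ~ ., data = train, method = "class")
rf_fit   <- randomForest(Species ~ ., data = train)

# Actual vs. predicted tables on the same held-out rows
tree_tab <- table(actual = test$Species,
                  predicted = predict(tree_fit, newdata = test, type = "class"))
rf_tab   <- table(actual = test$Species,
                  predicted = predict(rf_fit, newdata = test))

# Misclassification rate = 1 - (correctly classified / total)
misclass <- function(tab) 1 - sum(diag(tab)) / sum(tab)
tree_err <- misclass(tree_tab)
rf_err   <- misclass(rf_tab)

# For reference: the forest's own confusion matrix and final OOB error
rf_fit$confusion
oob_err <- rf_fit$err.rate[rf_fit$ntree, "OOB"]
```

The difference between tree_err and rf_err is then a direct estimate
of how much predictive ability the single tree gives up on these rows.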

So, for what it's worth, I think you are correct. Also, do you know
about ctree in the "party" package? If you want to retain the
explanatory power of a single tree and still have an accurate
classifier, I have found ctree to work quite well.
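
A minimal sketch of what that might look like, again with iris and a
hypothetical 100/50 split standing in for the real data:

```r
# Sketch only: iris and the split are assumptions for illustration.
library(party)

set.seed(1)
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Conditional inference tree; still a single tree that can be plotted
ct_fit <- ctree(Species ~ ., data = train)
print(ct_fit)   # text description of the splits
plot(ct_fit)    # tree diagram for non-specialists

# Same actual vs. predicted table as for rpart and randomForest
ct_tab <- table(actual = test$Species,
                predicted = predict(ct_fit, newdata = test))
ct_err <- 1 - sum(diag(ct_tab)) / sum(ct_tab)
```

The resulting ct_tab can be set beside the rpart and randomForest
tables, provided all three are computed on the same rows.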



On Mon, Jan 29, 2007 at 11:34:51AM +1100, Amy Koch wrote:
> Hi,
> I have done an analysis using 'rpart' to construct a Classification Tree. I
> am wanting to retain the output in tree form so that it is easily
> interpretable. However, I am wanting to compare the 'accuracy' of the tree
> to a Random Forest to estimate how much predictive ability is lost by using
> one simple tree. My understanding is that the error automatically displayed
> by the two functions is calculated differently so it is therefore incorrect
> to use this as a comparison. Instead I have produced a table for both
> analyses comparing the observed and predicted response. 
> E.g. table(data$dependent,predict(model,type="class"))
> I am looking for confirmation that (a) it is incorrect to compare the error
> estimates for the two techniques and (b) that comparing the
> misclassification rates is an appropriate method for comparing the two
> techniques.
> Thanks
> Amy
> Amelia Koch
> University of Tasmania
> School of Geography and Environmental Studies
> Private Bag 78 Hobart
> Tasmania, Australia 7001
> Ph: +61 3 6226 7454
> ajkoch at utas.edu.au
