[R] comparing random forests and classification trees

Wensui Liu liuwensui at gmail.com
Mon Jan 29 02:44:16 CET 2007


Amy,
If I were you, I would check the misclassification rates of both
models on both the training set and the testing set.
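
A rough sketch of that check follows. It is not run on your data: the
built-in iris data stands in for it, the 70/30 split and the seed are
arbitrary, and misclass() is just a helper name made up here. The point
is to score both models on the same rows by always passing newdata.
Without newdata, predict() on a randomForest object returns out-of-bag
predictions, while predict() on an rpart object returns fitted values
for the training rows, so the two tables would not be measuring the
same thing.

```r
# Sketch only: iris is a stand-in dataset; split ratio and seed are arbitrary.
library(rpart)
library(randomForest)

set.seed(1)
n <- nrow(iris)
train_idx <- sample(n, size = round(0.7 * n))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

tree_fit <- rpart(Species ~ ., data = train, method = "class")
rf_fit   <- randomForest(Species ~ ., data = train)

# Hypothetical helper: share of cases whose predicted class differs
# from the observed class.
misclass <- function(observed, predicted) {
  mean(as.character(observed) != as.character(predicted))
}

# Pass newdata explicitly so both models are scored on identical rows.
rates <- data.frame(
  model = c("rpart", "randomForest"),
  train = c(
    misclass(train$Species, predict(tree_fit, newdata = train, type = "class")),
    misclass(train$Species, predict(rf_fit, newdata = train))
  ),
  test = c(
    misclass(test$Species, predict(tree_fit, newdata = test, type = "class")),
    misclass(test$Species, predict(rf_fit, newdata = test))
  )
)
print(rates)

# Observed vs. predicted on the held-out rows, one table per model.
print(table(observed = test$Species,
            predicted = predict(tree_fit, newdata = test, type = "class")))
print(table(observed = test$Species,
            predicted = predict(rf_fit, newdata = test)))
```

The test column is the one to compare. A forest will usually show a
training error near zero, so the train column mostly tells you how much
each model overfits.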


On 1/28/07, Amy Koch <ajkoch at postoffice.utas.edu.au> wrote:
> Hi,
>
> I have used 'rpart' to construct a classification tree. I want to keep the
> output in tree form so that it is easy to interpret. However, I also want
> to compare the accuracy of the tree with that of a random forest, to
> estimate how much predictive ability is lost by using a single simple tree.
> My understanding is that the errors displayed by default by the two
> functions are calculated differently, so it would be incorrect to compare
> them directly. Instead, I have produced a table for each analysis that
> cross-tabulates the observed and predicted responses.
>
> E.g. table(data$dependent,predict(model,type="class"))
>
> I am looking for confirmation that (a) it is incorrect to compare the
> default error estimates of the two techniques, and (b) comparing the
> misclassification rates is an appropriate way to compare them.
>
> Thanks
>
> Amy
>
> Amelia Koch
> University of Tasmania
> School of Geography and Environmental Studies
> Private Bag 78 Hobart
> Tasmania, Australia 7001
> Ph: +61 3 6226 7454
> ajkoch at utas.edu.au


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


