[R] Classification Trees
Vladimir N. Kutinsky
kutinskyv at obninsk.com
Sat Jun 9 15:23:07 CEST 2001
I apologize if you receive multiple copies of this letter. This is the first time I've written to this mailing list, so please be kind:-)
I'm trying to write a program that grows a classification tree. I use the APL programming language, and I use R to compare and test the results.
I have a classification tree and a sequence of cost-complexity parameters (alphas): 0, A1, A2, ..., An. Now I want to choose the right-sized tree or, in other words, the optimal complexity parameter Ak. I understand that I should use V-fold cross-validation. The problem is that I don't quite understand how to prune the trees during CV:
1. If I use the initial sequence of alphas:
To test A1, I snip off all rooted nodes with cost-complexity parameters in the range [0, A1]; to test A2, I prune all nodes with cost-complexity parameters in the range [A1, A2]; and so on. Is this correct?
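To make the first scheme concrete, here is a minimal sketch in Python (not APL or R). It assumes each internal node already carries the alpha at which its branch would be snipped; the `Node` class, the `alpha` field, and the toy tree are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    # Assumed field: the alpha at which the branch rooted here is snipped.
    # None marks a terminal node of the fully grown tree.
    alpha: Optional[float] = None
    children: List["Node"] = field(default_factory=list)


def prune(node: Node, threshold: float) -> Node:
    """Return a copy of the tree in which every branch whose alpha is
    <= threshold has been collapsed into a terminal node."""
    if not node.children or (node.alpha is not None and node.alpha <= threshold):
        return Node(node.name, node.alpha)
    return Node(node.name, node.alpha, [prune(c, threshold) for c in node.children])


def leaves(node: Node) -> List[str]:
    """List the names of the terminal nodes, left to right."""
    if not node.children:
        return [node.name]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


# Toy tree with made-up alphas: root (0.30) -> a (0.10), b (0.05).
tree = Node("root", 0.30, [
    Node("a", 0.10, [Node("a1"), Node("a2")]),
    Node("b", 0.05, [Node("b1"), Node("b2")]),
])

for threshold in (0.0, 0.05, 0.10, 0.30):
    print(threshold, leaves(prune(tree, threshold)))
```

In this sketch the pruning is cumulative: testing a larger alpha removes everything with a parameter up to that alpha, which gives the same tree as snipping the ranges [0, A1], [A1, A2], ... one after another.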
2. If I use a new sequence of complexity parameters 0, B1, B2, ..., Bm, where Bi is the geometric mean of A[i] and A[i+1], Bi = SQRT(A[i] * A[i+1]):
Suppose I select Bk as the optimal parameter. Which Ai does this optimal Bk correspond to?
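The second scheme can be sketched in a few lines of Python (again, not APL or R). The sketch assumes the alphas are sorted in increasing order and start at 0, and it reads "corresponds to" as "lies in the interval [A[i], A[i+1])"; the function names and the numbers are made up for illustration.

```python
import math
from bisect import bisect_right
from typing import List


def geometric_midpoints(alphas: List[float]) -> List[float]:
    """B[i] = sqrt(A[i] * A[i+1]) for each pair of consecutive alphas."""
    return [math.sqrt(a * b) for a, b in zip(alphas, alphas[1:])]


def interval_index(alphas: List[float], beta: float) -> int:
    """Return i such that alphas[i] <= beta < alphas[i+1]
    (or the last index if beta is at or beyond the largest alpha)."""
    return bisect_right(alphas, beta) - 1


# Made-up alpha sequence 0, A1, A2, A3.
alphas = [0.0, 0.01, 0.04, 0.16]
betas = geometric_midpoints(alphas)

for beta in betas:
    i = interval_index(alphas, beta)
    print(f"B = {beta:.4f} falls in the interval starting at A[{i}] = {alphas[i]}")
```

Under this reading, each B[i] sits strictly between A[i] and A[i+1] (except the first, which is 0 because A[0] = 0), so the chosen Bk points back to the tree of the original sequence that starts at A[k].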
Which of the two ways should I follow? Are there any other ways of choosing a right-sized tree? Does anybody have any ideas?