[R] Corrupt trees

Osei Poku opoku at ece.cmu.edu
Wed Mar 23 19:29:15 CET 2011


Hi Everyone,

I have been using the "tree" package for a while with no problems until now.

When I run predict(tree, newdata), I get the error "corrupt tree" for about 50% of the trees that I generate with tree(). For the other trees, predict() completes with no errors.

I haven't found anything that distinguishes the corrupt trees from the working ones. The code and the data set are the same in both cases. The only difference is that the training data is sampled randomly from the complete data set on each run.


The session below, copied from my console, illustrates the problem:

> data <- load.data()
> mytree <- run.tree(data)
> mypred <- gen.predictions(mytree, data)
Error in pred1.tree(object, tree.matrix(newdata)) : corrupt tree
> mytree <- run.tree(data)
> mypred <- gen.predictions(mytree, data)
 (no error here)
>


The function run.tree() does the following:
1. Split the data into training and test
2. Call tree() to generate a tree object
3. Return the indices of the training and test rows in the data, as well as the tree object

The function gen.predictions() does the following:
1. Call predict() with the tree object from run.tree() and the test data
2. Return the yprob values and other information about the predictions
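
The real run.tree() and gen.predictions() are not reproduced in this message. A minimal sketch of the structure described above might look like the following. The iris data set, the formula Species ~ ., and the train.frac argument are stand-ins chosen only for illustration and are not taken from the actual code. The sketch shows the shape of the two functions, not a reproduction of the error.

```r
library(tree)  # assumes the CRAN "tree" package is installed

# Hypothetical stand-in for run.tree(): random split, then fit a tree.
run.tree <- function(data, train.frac = 0.5) {
  n <- nrow(data)
  train.idx <- sample(n, floor(train.frac * n))  # random training rows
  test.idx <- setdiff(seq_len(n), train.idx)     # remaining rows are the test set
  fit <- tree(Species ~ ., data = data[train.idx, ])
  list(train = train.idx, test = test.idx, tree = fit)
}

# Hypothetical stand-in for gen.predictions(): predict on the test rows.
gen.predictions <- function(res, data) {
  test.data <- data[res$test, ]
  # For a classification tree, the default type = "vector" gives
  # a matrix of class probabilities, one row per test observation.
  yprob <- predict(res$tree, newdata = test.data)
  list(yprob = yprob, test = res$test)
}

set.seed(1)  # fixing the seed makes a given split repeatable
mytree <- run.tree(iris)
mypred <- gen.predictions(mytree, iris)
```

Calling set.seed() before run.tree() makes the random split repeatable, so a split that triggers the error could be regenerated and inspected.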

Any help would be greatly appreciated.

Osei

