[R] predict.tree

ande8047 at umn.edu
Wed May 9 19:17:01 CEST 2007


I have a classification tree model similar to the following (slightly 
simplified here):

> treemod<-tree(y~x)

where y is a factor and x is a matrix of numeric predictors. They have 
dimensions:

> length(y)
[1] 1163
> dim(x)
[1] 1163   75

I’ve evaluated the tree model and am happy with the fit. I also have a 
matrix of cases that I want to use the tree model to classify. Call it 
newx:

> dim(newx)
[1] 68842    75

The column names of newx match the column names of x. It seems that 
prediction should be straightforward. To classify the first 10 values of 
newx, for example, I think I should use:

> predict(treemod, newx[1:10,], type = "class")

However, this returns a vector of the predicted classes of the training 
data x, rather than the predicted classes of the new data. The returned 
vector has length 1163, not length 10, and this happens no matter how many 
rows of newx I pass. R also issues this warning:

'newdata' had 10 rows but variable(s) found have 1163 rows

I must be misunderstanding the way I should format the newdata I pass to 
predict. I’ve tried the rpart package as well, but have a similar problem. 
What am I missing?
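
For reference, here is a minimal sketch of one likely cause and a possible workaround; it is not a confirmed diagnosis. With a formula like y~x, the fitted model refers to a single variable named x. If predict cannot find a variable called x inside newdata, it appears to fall back to the x in the workspace, which is the 1163-row training matrix. That would explain the warning. The sketch fits the model from a data frame, so the formula refers to the individual column names, and passes newdata as a data frame with those same column names. The simulated data, the column names V1 to V5, and the object names train, mod and pred are all hypothetical. rpart is used only because it ships with R; the same pattern is assumed, not verified here, to carry over to tree().

```r
library(rpart)

set.seed(1)

# Hypothetical stand-ins for the real data: 200 training rows, 5 predictors
x <- matrix(rnorm(200 * 5), ncol = 5,
            dimnames = list(NULL, paste0("V", 1:5)))
y <- factor(ifelse(x[, 1] + x[, 2] > 0, "a", "b"))

# Hypothetical new cases with the same column names as x
newx <- matrix(rnorm(50 * 5), ncol = 5,
               dimnames = list(NULL, colnames(x)))

# Fit from a data frame so the formula refers to the columns V1..V5,
# not to the matrix object `x` itself
train <- data.frame(y = y, x)
mod <- rpart(y ~ ., data = train, method = "class")

# Pass newdata as a data frame whose column names match the predictors
pred <- predict(mod, newdata = as.data.frame(newx[1:10, ]), type = "class")

length(pred)  # one predicted class per row of newdata, so 10 here
```

The design choice is to keep both the training and the new cases in data frames, so that variables are matched by column name rather than by whatever object happens to be called x.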

Thanks in advance,

Ryan Anderson
Graduate Student
Dept. of Forest Resources
University of Minnesota
ande8047 at umn.edu


