[R] Couple of Questions about Classification trees

Jen_mp3 Jen_mp3 at msn.com
Wed Mar 11 21:53:46 CET 2009



Okay, perhaps I should have been clearer about the data. I'm actually working
on spectroscopic measurements from food authenticity testing. I have five
different types of meat: 55 of chicken, 55 of turkey, 55 of pork, 34 of beef
and 32 of lamb - 231 in total. On each of these 231 samples, 1024
spectroscopic measurements were taken, giving a matrix of 231 by 1024. The
questions I want answered are which of the 1024 measurements are important
for predicting meat type, and which of the different types of meat are
incorrectly classified - i.e. can we tell the difference between chicken and
turkey. To carry out a multivariate analysis on the data I've split it
into two: a training data set and a test data set - half and half, although I
think the larger half (55 goes into 27 and 28) went into the test data set,
which explains the inequalities in the row numbers. By the way, 1024 is
standard - can't change that. Can't change the 231 either.
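
The half-and-half split described above could be done within each meat type, so that both halves keep the same mix of classes. The following is only a sketch in base R: the spectra are simulated random numbers standing in for the real 231 by 1024 matrix, and the names counts, spectra, meat and train.idx are made up for illustration.

```r
# Simulated stand-in for the 231 x 1024 spectra (random numbers, NOT real data)
set.seed(1)
counts    <- c(chicken = 55, turkey = 55, pork = 55, beef = 34, lamb = 32)
meat.type <- factor(rep(names(counts), times = counts))
spectra   <- matrix(rnorm(length(meat.type) * 1024), ncol = 1024)
meat      <- data.frame(meat.type = meat.type, spectra)

# Within each meat type, send floor(n/2) rows to training and the rest to test
rows.by.type <- split(seq_len(nrow(meat)), meat$meat.type)
train.idx <- unlist(lapply(rows.by.type, function(i) {
  i[sample.int(length(i), floor(length(i) / 2))]
}))

train <- meat[train.idx, ]
test  <- meat[-train.idx, ]

# Class counts in each half
table(train$meat.type)
table(test$meat.type)
```

With the class sizes quoted above this gives 114 training rows (27 + 27 + 27 + 17 + 16) and 117 test rows, so the larger half of each odd-sized class ends up in the test set.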

So I created a new column, meat.type, holding the meat type for each row.

End up with the following R code:
library(tree)
meat.tree <- tree(meat.type~., data=train)
Using cv.tree, the lowest misclassification rate occurs at a tree size of 5,
so I cut the number of terminal nodes down to 5 using prune.tree:
prunedtree <- prune.tree(meat.tree, best = 5, method = "misclass")
Then I want to use predict.tree and the test data set. 
predicttree <- predict.tree(prunedtree, data = test)
I already said what it produces. 
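
One thing that may be worth checking in the steps above: predict.tree takes the test set through its newdata argument, and an argument called data is not one it uses, so the fitted values for the training rows may be what comes back. The sketch below runs the same fit, cross-validate, prune and predict sequence with newdata and type = "class", then tabulates observed against predicted classes. It uses the built-in iris data purely as a stand-in for the meat spectra, and the object names (fit, cv, pruned, used, confusion) are made up for illustration.

```r
library(tree)

# Stand-in data: iris plays the role of the meat spectra (an assumption)
set.seed(1)
idx   <- sample(nrow(iris), nrow(iris) / 2)
train <- iris[idx, ]
test  <- iris[-idx, ]

fit <- tree(Species ~ ., data = train)

# Cross-validate on misclassification count and pick the best tree size
# (kept at 2 or more so the pruned object is still a proper tree)
cv        <- cv.tree(fit, FUN = prune.misclass)
best.size <- max(2, cv$size[which.min(cv$dev)])
pruned    <- prune.misclass(fit, best = best.size)

# Variables that actually appear in a split of the pruned tree
used <- setdiff(unique(as.character(pruned$frame$var)), "<leaf>")
used

# Predict classes for the held-out rows: newdata, not data
pred <- predict(pruned, newdata = test, type = "class")

# Rows are the true classes, columns the predicted ones
confusion <- table(observed = test$Species, predicted = pred)
confusion

# Overall test misclassification rate
1 - sum(diag(confusion)) / sum(confusion)
```

The off-diagonal cells of the table show which classes get confused with which (for the meat data, for example, chicken predicted as turkey), and the variables listed in used are the only measurements the pruned tree relies on.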

Again, how would I display the misclassification rate at each node on the
diagram? I know about misclass.tree(prunedtree, detail = TRUE), but that
doesn't actually display them on the classification tree - it just prints a
bunch of numbers, and it wouldn't look very neat if I had to add them by
hand later.
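
One possible way to get those numbers onto the diagram is to reuse the node coordinates that plot() hands back invisibly for a tree object and draw the per-node counts with base text(). This is a sketch under assumptions, not a feature of the tree package: it relies on misclass.tree(..., detail = TRUE) and the returned coordinates both following the row order of the tree's frame component, it uses iris as a stand-in for the meat data, and the names node.mis, node.n and xy are made up for illustration.

```r
library(tree)

# Stand-in data and tree (assumption: iris in place of the meat spectra)
fit    <- tree(Species ~ ., data = iris)
pruned <- prune.misclass(fit, best = 3)

# Misclassified cases and total cases at each node, in frame order
node.mis <- misclass.tree(pruned, detail = TRUE)
node.n   <- pruned$frame$n

# plot() draws the tree and invisibly returns the node coordinates
xy <- plot(pruned)
text(pruned, pretty = 0)   # the usual split and class labels

# Add "misclassified / total" at each node; the offset may need tuning
text(xy$x, xy$y,
     labels = paste0(node.mis, "/", node.n),
     pos = 1, offset = 1.5, cex = 0.8, col = "red")
```

Dividing node.mis by node.n gives a rate rather than a count if that is preferred on the plot.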




