[R] replicate results of tree package

Naresh Gurbuxani n@re@h_gurbux@n| @end|ng |rom hotm@||@com
Thu Oct 13 14:31:15 CEST 2022


I am trying to understand "deviance" in the classification tree output
from the tree package.

library(tree)

set.seed(911)
mydf <- data.frame(
    name = as.factor(rep(c("A", "B"), c(10, 10))),
    x = c(rnorm(10, -1), rnorm(10, 1)),
    y = c(rnorm(10, 1), rnorm(10, -1)))

mytree <- tree(name ~ ., data = mydf)

mytree
# node), split, n, deviance, yval, (yprob)
#       * denotes terminal node

# 1) root 20 27.730 A ( 0.5 0.5 )  
#   2) y < -0.00467067 10  6.502 B ( 0.1 0.9 )  
#     4) x < 1.50596 5  5.004 B ( 0.2 0.8 ) *
#     5) x > 1.50596 5  0.000 B ( 0.0 1.0 ) *
#   3) y > -0.00467067 10  6.502 A ( 0.9 0.1 )  
#     6) x < -0.578851 5  0.000 A ( 1.0 0.0 ) *
#     7) x > -0.578851 5  5.004 A ( 0.8 0.2 ) *

# Replicate results for node 2
# Probabilities tie out
with(subset(mydf, y < -0.00467067), table(name))
# name
# A B 
# 1 9

# Cannot replicate deviance (tree reports 6.502 for node 2)
# using -1 * sum(p_mk * log(p_mk))
-(0.1 * log(0.1) + 0.9 * log(0.9))
# [1] 0.325083
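
One guess, not confirmed against the package source: the printed values
look consistent with the multinomial deviance -2 * sum(n_k * log(p_k)),
where n_k is the number of observations of class k in the node. That is
2 * n times the per-observation entropy computed above. The helper
node_deviance below is hypothetical (it is not a function from tree) and
only takes class counts, so it runs without the package.

```r
# Hypothetical helper: multinomial deviance of one node from its class counts.
# Assumes deviance = -2 * sum(n_k * log(p_k)), with p_k = n_k / n.
node_deviance <- function(counts) {
    counts <- counts[counts > 0]    # treat 0 * log(0) as 0
    p <- counts / sum(counts)       # class proportions within the node
    -2 * sum(counts * log(p))
}

node_deviance(c(A = 10, B = 10))  # root:   about 27.73
node_deviance(c(A = 1, B = 9))    # node 2: about 6.502
node_deviance(c(A = 1, B = 4))    # node 4: about 5.004
node_deviance(c(A = 0, B = 5))    # node 5: 0 (pure node)
```

If this reading is right, the same counts explain every deviance in the
printed tree, including the zero deviance of the pure nodes.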

1.  In the documentation, is it possible to find the definition of
deviance?
2.  Is it possible to see the code where it calculates deviance?

Thanks,
Naresh


