[R] how to evaluate the significance of attributes in tree growing

Liaw, Andy andy_liaw at merck.com
Thu Jan 27 02:42:51 CET 2005


FWIW, I wrote a little function a while ago to extract variable importance
as defined in the CART book.  It's rather limited:  it only works for
regression problems, and you need to set maxsurrogate=0 and maxcompete=0.
It may (or may not) help you:

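varimp.rpart <- function(x) {
    # Deviance at each internal (non-leaf) node
    dev <- x$frame[, c("var", "dev")]
    dev <- dev[dev$var != "<leaf>", ]
    # One row per primary split, given maxsurrogate=0 and maxcompete=0
    improve <- x$splits[, "improve"]
    # Sum deviance * improve by variable; [-1] drops the "<leaf>" level
    imp <- tapply(dev[, 2] * improve, dev$var, sum)[-1]
    # Variables never used in a split come back as NA; report them as 0
    imp[is.na(imp)] <- 0
    imp
}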

Here's an example using the Boston housing data:

> library(rpart)
> data(Boston, package="MASS")
> boston.rp <- rpart(medv ~ ., Boston,
+                    control=rpart.control(maxsurrogate=0, maxcompete=0))
> varimp.rpart(boston.rp)
     crim        zn     indus      chas       nox        rm       age       dis 
 1136.809     0.000     0.000     0.000     0.000 23825.922     0.000  1544.804 
      rad       tax   ptratio     black     lstat 
    0.000     0.000     0.000     0.000  7988.955 
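
The maxsurrogate=0 / maxcompete=0 restriction is there because the splits
matrix otherwise also holds rows for competitor and surrogate splits, so it
no longer lines up one-to-one with the non-leaf rows of the frame.  Below is
a rough sketch of one way around that.  It assumes the rows for each internal
node are stored as the primary split first, then the competitors, then the
surrogates (the layout summary.rpart appears to rely on).  The name
varimp.rpart2 is made up for illustration, and like the function above it is
only meant for regression trees:

```r
library(rpart)

# Hypothetical variant of varimp.rpart that picks out the primary split of
# each internal node, so competitor and surrogate rows are skipped.
varimp.rpart2 <- function(x) {
    ff <- x$frame
    is.leaf <- ff$var == "<leaf>"
    # Assumed layout of x$splits: for each internal node, one primary split
    # row, then ncompete competitor rows, then nsurrogate surrogate rows.
    index <- cumsum(c(1, ff$ncompete + ff$nsurrogate + !is.leaf))
    primary <- index[which(!is.leaf)]
    improve <- x$splits[primary, "improve"]
    # Same computation as before: node deviance times the split's improvement
    imp <- tapply(ff$dev[!is.leaf] * improve, ff$var[!is.leaf], sum)
    imp <- imp[names(imp) != "<leaf>"]
    imp[is.na(imp)] <- 0
    imp
}

data(Boston, package = "MASS")
fit.default <- rpart(medv ~ ., Boston)   # competitors and surrogates kept
varimp.rpart2(fit.default)
```

With no missing values in the data, the fitted tree should not depend on
maxsurrogate or maxcompete, so this ought to agree with varimp.rpart applied
to the restricted fit.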

Both gbm and randomForest have analogous measures.
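
For reference, a minimal sketch of how those measures can be pulled out,
assuming both packages are installed.  The seed and n.trees=500 are arbitrary
choices for illustration, not recommendations:

```r
library(randomForest)
library(gbm)

data(Boston, package = "MASS")
set.seed(1)   # arbitrary seed, only for reproducibility

# randomForest: importance=TRUE also computes the permutation-based measure
boston.rf <- randomForest(medv ~ ., data = Boston, importance = TRUE)
importance(boston.rf)   # columns %IncMSE and IncNodePurity for regression

# gbm: relative influence of each predictor
boston.gbm <- gbm(medv ~ ., data = Boston, distribution = "gaussian",
                  n.trees = 500)
summary(boston.gbm, n.trees = 500, plotit = FALSE)   # columns var and rel.inf
```

IncNodePurity is the closest in spirit to the rpart measure above, since it
also adds up the decrease in residual sum of squares from splits on each
variable.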

Andy


> From: WeiWei Shi
> 
> Hi, there:
> 
> I am wondering if there is a package in R (doing decision trees) which
> can provide some methods to evaluate the significance of attributes. I
> remember that randomForest gives some output like that. Unfortunately my
> current computing environment cannot handle my datasets if I use
> randomForest, so I am wondering whether other packages can do this
> job.
> 
> 
> Thanks,
> 
> Ed