[R] randomForest: predictor importance (for regressions)

Liaw, Andy andy_liaw at merck.com
Thu May 6 14:37:28 CEST 2010


See reply inline below. 

Andy

From: Dimitri Liakhovitski
> 
> I have a question about predictor importances in randomForest.
> 
> Once I've run randomForest and got my object, I get their importances:
> rfresult$importance
> I also get the "standard errors" of the permutation-based importance
> measure: rfresult$importanceSD
> 
> I have 2 questions:
> 
> 1. Because I am dealing with regressions, I am getting an importance object
> (rfresult$importance) with two columns, labeled "%IncMSE" (the first column)
> and "IncNodePurity" (the second column). I assume it's the first one that is
> the mean decrease in accuracy due to permutation. Am I correct or am I
> wrong? I am confused because ?randomForest says: "For Regression, the first
> column is the mean decrease in accuracy and the second the mean decrease in
> MSE." - but it is the first column, not the second, that has "MSE" in its
> header.

Yes, the first column ("%IncMSE") is the permutation-based measure. In
regression trees, node impurity is measured by MSE, so the second
measure ("IncNodePurity"), which averages over all trees the cumulative
reduction in node impurity due to splits on a variable, is what the help
page calls "mean decrease in MSE".
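
A minimal sketch of how the two measures can be pulled out separately,
assuming the randomForest package is installed. The built-in mtcars data
and the object names (rf, imp, perm, purity) are illustrative assumptions,
not taken from the original post:

```r
library(randomForest)

set.seed(1)  # forests are random; fix the seed for repeatable numbers

# importance = TRUE is needed to compute the permutation-based measure
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

imp <- importance(rf)   # both measures, one row per predictor
colnames(imp)           # column labels as reported by the package

perm   <- importance(rf, type = 1)  # permutation-based measure only
purity <- importance(rf, type = 2)  # node-impurity measure only
```

Here type = 1 selects the permutation-based column and type = 2 the
node-impurity column, so the code does not depend on column positions.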
 
> 2. According to this thread
> (http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg94873.html), the
> overall importance measure is mean(d[i]) / se(d[i]), where se(d[i]) is
> sd(d[i])/sqrt(ntree) (the "standard error").
> So, in order to get at the importance of predictors (and I want to use the
> permutation-based importance) - should I just take the first column of
> rfresult$importance or should I first divide rfresult$importance by
> rfresult$importanceSD - to get something analogous to z-scores and use
> those?

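See the "scale" argument in ?importance: it controls whether the
permutation-based measure is divided by its "standard error", so there
is no need to do the division by hand.  The recommended way of
extracting components of an object in R is to use the extractor
functions when they exist.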
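
A hedged sketch of what the "scale" argument changes, again assuming the
randomForest package and using mtcars only for illustration (the names
rf, scaled, raw and manual are made up for this example). The manual
division is shown for comparison only; the extractor is the recommended
route:

```r
library(randomForest)

set.seed(1)
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

# Permutation-based importance via the extractor function
scaled <- importance(rf, type = 1, scale = TRUE)   # divided by "standard error"
raw    <- importance(rf, type = 1, scale = FALSE)  # unscaled measure

# Manual version of the scaling, using the stored components.
# Assumption: no predictor has a (near-)zero importanceSD.
manual <- rf$importance[, "%IncMSE"] / rf$importanceSD

# Compare the extractor result with the manual ratio
all.equal(unname(scaled[, 1]), unname(manual))
```

The unscaled values are the same numbers stored in the first column of
rf$importance, so scale = FALSE answers the "just take the first column"
option and scale = TRUE the "z-score-like" option.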
 
> Thank you very much!
> 
> -- 
> Dimitri Liakhovitski
> Ninah.com
> Dimitri.Liakhovitski at ninah.com
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.