[R] What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?

Liaw, Andy andy_liaw at merck.com
Tue Nov 19 13:53:09 CET 2013


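The difference is scaling: importance() defaults to scale=TRUE, which divides the permutation-based measures (the per-class columns and MeanDecreaseAccuracy) by their "standard errors".  See the help page, ?importance, for details.  If you extract the $importance component from a randomForest object, you get the raw values without that scaling.  MeanDecreaseGini is not permutation-based, so it is never scaled and matches in both.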
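
A minimal sketch of the comparison, assuming the randomForest package is installed and using the built-in iris data in place of the poster's data (the names fit, raw, scaled, se and manual are only illustrative).  It also assumes the standard errors used for scaling are the ones stored in the importanceSD component of the fitted object:

```r
library(randomForest)

set.seed(42)  # the forest is random; fix the seed for repeatability

# importance = TRUE is needed to get the permutation-based measures
fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

raw    <- fit$importance    # stored, unscaled values
scaled <- importance(fit)   # scale = TRUE is the default

# With scale = FALSE, importance() should return the stored values as-is
isTRUE(all.equal(importance(fit, scale = FALSE), raw))

# Reproduce the scaled column by hand: raw value / its standard error
# (assumes no standard error is numerically zero for this data)
se     <- fit$importanceSD[, "MeanDecreaseAccuracy"]
manual <- raw[, "MeanDecreaseAccuracy"] / se
isTRUE(all.equal(manual, scaled[, "MeanDecreaseAccuracy"]))

# MeanDecreaseGini is not permutation-based, so it is the same in both
isTRUE(all.equal(raw[, "MeanDecreaseGini"], scaled[, "MeanDecreaseGini"]))
```

If each comparison prints TRUE, the two outputs differ only by that division, which would explain why the MeanDecreaseGini column is the only one that agrees.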

Best,
Andy

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Lopez, Dan
Sent: Wednesday, November 13, 2013 12:16 PM
To: R help (r-help at r-project.org)
Subject: [R] What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?

Hi R Expert Community,

My question: What is the difference between Mean Decrease Accuracy produced by importance(foo) vs foo$importance in a Random Forest Model?

I ran a Random Forest classification model with a binary response. I stored the model in object FOREST_model. I then ran importance(FOREST_model) and FOREST_model$importance. I usually use the former but decided to learn more about what is in summary(randomForest), so I ran the latter. I expected both to produce identical output, but Mean Decrease Gini is the only measure that is identical in both.

I looked at ?randomForest and the 'randomForest' package documentation and didn't find anything explaining this difference.

I am not including a reproducible example because this is most likely something simple that I am just not aware of, such as one being divided by something (if so, by what?).


importance(FOREST_model)

                       HC          TER MeanDecreaseAccuracy MeanDecreaseGini
APPT_TYP_CD_LL 0.16025157 -0.521041660           0.15670297        12.793624
ORG_NAM_LL     0.20886631 -0.952057325           0.20208393       107.137049
NEW_DISCIPLINE 0.20685079 -0.960719435           0.20076762        86.495063


FOREST_model$importance

                         HC           TER MeanDecreaseAccuracy MeanDecreaseGini
APPT_TYP_CD_LL 0.0049473962 -3.727629e-03         0.0045949805        12.793624
ORG_NAM_LL     0.0090715845 -2.401016e-02         0.0077298067       107.137049
NEW_DISCIPLINE 0.0130672572 -2.656671e-02         0.0114583178        86.495063

Dan Lopez
LLNL, HRIM, Workforce Analytics & Metrics

