[R] interpret the importance output?

Liaw, Andy andy_liaw at merck.com
Wed Aug 29 15:03:33 CEST 2012


The "type=1" importance measure in RF compares the prediction error of each tree on the OOB (out-of-bag) data with the prediction error of the same tree on the OOB data after the values of one variable have been randomly shuffled.  If the variable has no predictive power, the two errors should be very close, and there is about a 50% chance that the difference is negative.  If the variable is "important", shuffling its values should significantly degrade the prediction, which shows up as increased MSE.  The importance measure takes the mean of these per-tree differences in MSE and then divides it by the SD of the differences.
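To make the computation described above concrete, here is a minimal base-R sketch. It is not the randomForest internals: as a loudly stated assumption, bagged linear models (`lm` fit on bootstrap samples) stand in for the trees, and the data, the variable names and `n_models` are made up for illustration. Only the mechanism is the point: the OOB error, the OOB error with one variable shuffled, and the mean of the differences divided by their SD.

```r
# Hand-rolled sketch of a scaled permutation importance.
# Assumption: bagged linear models stand in for the trees of a forest;
# the simulated data and all names below are hypothetical.
set.seed(1)

n <- 300
dat <- data.frame(v1 = rnorm(n), v2 = rnorm(n), v3 = rnorm(n))
dat$y <- 2 * dat$v2 + rnorm(n)   # only v2 carries signal

n_models <- 200
vars <- c("v1", "v2", "v3")

# One row per model, one column per variable:
# (OOB MSE with that variable shuffled) - (OOB MSE on the intact OOB data)
diffs <- matrix(NA_real_, nrow = n_models, ncol = length(vars),
                dimnames = list(NULL, vars))

for (b in seq_len(n_models)) {
  in_bag <- sample(n, replace = TRUE)             # bootstrap sample
  oob <- dat[-unique(in_bag), , drop = FALSE]     # rows never drawn
  fit <- lm(y ~ v1 + v2 + v3, data = dat[in_bag, ])
  mse_oob <- mean((oob$y - predict(fit, newdata = oob))^2)

  for (v in vars) {
    shuffled <- oob
    shuffled[[v]] <- sample(shuffled[[v]])        # break the link to y
    mse_perm <- mean((shuffled$y - predict(fit, newdata = shuffled))^2)
    diffs[b, v] <- mse_perm - mse_oob
  }
}

raw_imp <- colMeans(diffs)                        # mean increase in MSE
scaled_imp <- raw_imp / apply(diffs, 2, sd)       # mean divided by SD
frac_negative <- colMeans(diffs < 0)              # share of negative differences

print(round(cbind(raw_imp, scaled_imp, frac_negative), 3))
```

In this toy setup one would expect v2 to show a clearly positive scaled value, while v1 and v3 should sit near zero with roughly half of their per-model differences negative, which is the pattern to look for when reading a "type=1" importance table.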

With that, I hope it's clear that only v2 and v4 in your example are potentially "important".

Best,
Andy

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Johnathan Mercer
Sent: Monday, August 27, 2012 11:40 AM
To: r-help at stat.math.ethz.ch
Subject: [R] interpret the importance output?

> importance(rfor.pdp11_t25.comb1,type=1)
          %IncMSE
v1 -0.28956401263
v2  1.92865561147
v3 -0.63443929130
v4  1.58949137047
v5  0.03190940065

I wasn't entirely confident in interpreting these results based on the
documentation.
Could you please help me interpret them?





