[R] Random Forests: Question about R^2

Liaw, Andy andy_liaw at merck.com
Mon Apr 13 16:52:40 CEST 2009


MSE is the mean of the squared residuals.  For the training data, the OOB
estimate is used (i.e., residual = data - OOB prediction, MSE =
sum(residuals^2) / n, where the OOB prediction for a case is the mean of
the predictions from all trees for which that case is OOB).  It is _not_
the average OOB MSE of the trees in the forest.
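
The distinction above can be checked numerically.  The following is a minimal sketch, not taken from the original message: the mtcars data, the seed, and the variable names are illustrative assumptions, and it assumes a randomForest version in which keep.inbag = TRUE stores a zero entry in rf$inbag for every (case, tree) pair where the case is OOB.  The variance divisor n (rather than n - 1) in the pseudo R^2 is likewise an assumption about how the package evaluates Var(y).

```r
library(randomForest)

set.seed(1)
x <- mtcars[, -1]   # predictors (illustrative data set)
y <- mtcars$mpg     # response
n <- length(y)

# keep.inbag = TRUE records which cases each tree was grown on
rf <- randomForest(x, y, ntree = 500, keep.inbag = TRUE)

# OOB MSE as described: residual = data - OOB prediction
mse_oob <- mean((y - rf$predicted)^2)

# Rebuild the OOB prediction by hand: for each case, average the
# predictions of only those trees for which the case is OOB
all_pred   <- predict(rf, x, predict.all = TRUE)$individual  # n x ntree
is_oob     <- rf$inbag == 0
manual_oob <- rowSums(all_pred * is_oob) / rowSums(is_oob)

# The quantity the MSE is NOT: the average of per-tree OOB MSEs
per_tree_mse <- colSums((y - all_pred)^2 * is_oob) / colSums(is_oob)
avg_tree_mse <- mean(per_tree_mse)

# Pseudo R^2 = 1 - mse / Var(y), with Var(y) using divisor n (assumption)
rsq_oob <- 1 - mse_oob / (var(y) * (n - 1) / n)

print(c(mse_oob = mse_oob, reported = rf$mse[rf$ntree],
        avg_tree_mse = avg_tree_mse))
print(c(rsq_oob = rsq_oob, reported = rf$rsq[rf$ntree]))
```

If the assumptions hold, mse_oob and rsq_oob should agree with the last elements of rf$mse and rf$rsq, while avg_tree_mse should come out noticeably larger, since single trees predict worse than their average.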

I hope there's no question about how the pseudo R^2 is computed on a
test set?  If you understand how that's done, I assume the confusion is
only about how the OOB MSE is formed.
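
For comparison, the test-set version of the same formula can be sketched as follows.  This is an illustration under stated assumptions, not code from the original message: the mtcars split, the seed, and the variable names are made up for the example, and the divisor n in the variance is an assumption chosen to mirror the training-set calculation.

```r
library(randomForest)

set.seed(2)
test_idx <- sample(nrow(mtcars), 10)   # hypothetical hold-out split
train <- mtcars[-test_idx, ]
test  <- mtcars[test_idx, ]

rf <- randomForest(mpg ~ ., data = train, ntree = 500)

# On a test set every tree contributes to the prediction (no OOB step)
pred_test <- predict(rf, newdata = test)
mse_test  <- mean((test$mpg - pred_test)^2)

# Pseudo R^2 = 1 - mse / Var(y), evaluated on the test responses
n_test   <- nrow(test)
var_test <- var(test$mpg) * (n_test - 1) / n_test   # divisor n (assumption)
rsq_test <- 1 - mse_test / var_test

print(c(mse_test = mse_test, rsq_test = rsq_test))
```

The only difference from the training-set case is where the predictions come from: here all trees vote on every test case, whereas the training-set figure uses OOB predictions.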

Best,
Andy

From: Dimitri Liakhovitski
> 
> Dear Random Forests gurus,
> 
> I have a question about R^2 provided by randomForest (for regression).
> I have not been able to find this information.
> 
> In the help file for randomForest under "Value" it says:
> 
> rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
> 
> Could someone please explain in somewhat more detail how exactly R^2
> is calculated?
> Is "mse" mean squared error for prediction?
> Is "mse" an average of mse's for all trees run on out-of-bag 
> holdout samples?
> In other words - is this R^2 based on out-of-bag samples?
> 
> Thank you very much for clarification!
> 
> -- 
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 