[R] Random Forests: Question about R^2

Liaw, Andy andy_liaw at merck.com
Tue Apr 14 00:22:32 CEST 2009


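Apologies: that should have been MSE = sum(residual^2) / n! So mse is
already on the squared scale, and there is no need to square it again
before dividing by Var(y).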
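
To make the corrected formula concrete, here is a minimal sketch in base R. The vectors y and oob_pred are made-up illustration data, not output from randomForest. The last step follows the help-file formula 1 - mse / Var(y) quoted below and uses R's var(), which is an assumption about which variance estimate is meant.

```r
# Hypothetical observed responses and their OOB predictions
# (illustration only; not produced by randomForest)
y        <- c(3.0, 5.0, 7.0, 9.0)
oob_pred <- c(2.5, 5.5, 6.0, 9.5)

residual <- y - oob_pred       # residual = data - OOB prediction
n        <- length(y)

# Corrected MSE: mean of the SQUARED residuals
mse <- sum(residual^2) / n

# The uncorrected expression lets positive and negative residuals cancel,
# so it does not measure prediction error
not_mse <- sum(residual) / n

# Pseudo R-squared as given in the help file: 1 - mse / Var(y)
# (assumption: var() with its default n - 1 divisor stands in for Var(y))
rsq <- 1 - mse / var(y)

print(c(mse = mse, not_mse = not_mse, rsq = rsq))
```

Because mse and var(y) are both in squared units of y, their ratio is unitless and mse does not need to be squared again.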

> -----Original Message-----
> From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com] 
> Sent: Monday, April 13, 2009 4:35 PM
> To: Liaw, Andy
> Cc: R-Help List
> Subject: Re: [R] Random Forests: Question about R^2
> 
> Andy,
> thank you very much!
> One clarification question:
> 
> If MSE = sum(residuals) / n, then
> in the formula (1 - mse / Var(y)) - shouldn't one square mse before
> dividing by variance?
> 
> Dimitri
> 
> 
> On Mon, Apr 13, 2009 at 10:52 AM, Liaw, Andy 
> <andy_liaw at merck.com> wrote:
> > MSE is the mean squared residuals.  For the training data, the OOB
> > estimate is used (i.e., residual = data - OOB prediction, MSE =
> > sum(residuals) / n, OOB prediction is the mean of predictions from all
> > trees for which the case is OOB).  It is _not_ the average OOB MSE of
> > trees in the forest.
> >
> > I hope there's no question about how the pseudo R^2 is computed on a
> > test set?  If you understand how that's done, I assume the confusion is
> > only how the OOB MSE is formed.
> >
> > Best,
> > Andy
> >
> > From: Dimitri Liakhovitski
> >>
> >> Dear Random Forests gurus,
> >>
> >> I have a question about R^2 provided by randomForest (for regression).
> >> I have not been able to find this information.
> >>
> >> In the help file for randomForest under "Value" it says:
> >>
> >> rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
> >>
> >> Could someone please explain in somewhat more detail how exactly R^2
> >> is calculated?
> >> Is "mse" mean squared error for prediction?
> >> Is "mse" an average of mse's for all trees run on out-of-bag
> >> holdout samples?
> >> In other words - is this R^2 based on out-of-bag samples?
> >>
> >> Thank you very much for clarification!
> >>
> >> --
> >> Dimitri Liakhovitski
> >> MarketTools, Inc.
> >> Dimitri.Liakhovitski at markettools.com
> >>
> > Notice:  This e-mail message, together with any attachments, contains
> > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station,
> > New Jersey, USA 08889), and/or its affiliates (which may be known
> > outside the United States as Merck Frosst, Merck Sharp & Dohme or
> > MSD and in Japan, as Banyu - direct contact information for affiliates is
> > available at http://www.merck.com/contact/contacts.html) that may be
> > confidential, proprietary copyrighted and/or legally privileged. It is
> > intended solely for the use of the individual or entity named on this
> > message. If you are not the intended recipient, and have received this
> > message in error, please notify us immediately by reply e-mail and
> > then delete it from your system.
> >
> >
> 
> 
> 
> -- 
> Dimitri Liakhovitski
> MarketTools, Inc.
> Dimitri.Liakhovitski at markettools.com
> 