[R] How to calculate the generalization error of random forests?

Martin Lam tmlammail at yahoo.com
Thu Feb 9 17:18:50 CET 2006


Hi,

Perhaps this is not the proper place to ask this
question, but I am out of options, so I apologize
in advance.

I want to know how the upper bound on the generalization
error of a random forest is estimated using the
out-of-bag estimates. I read in Breiman's paper that s
and rho determine this bound:
rho(1-s^2)/s^2.
Does s stand for the strength of the individual trees
or of the entire ensemble? rho stands for the mean
correlation between the trees.

If I have, let's say, built 3 trees in my forest and I
know for each tree the instances that were left out
during training, how do I calculate s and p, so I can
calculate the error?
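
To make the setup concrete, here is a minimal sketch of one possible
out-of-bag calculation, not a confirmed reading of the paper. It
assumes that s is the average out-of-bag margin over the instances
(so a property of the whole forest), and that the mean correlation is
the variance of that margin divided by the squared mean of the
per-tree standard deviations sqrt(p1 + p2 - (p1 - p2)^2), where p1 is
a tree's out-of-bag rate of voting for the true class and p2 its rate
of voting for the strongest wrong class. The function name, the input
layout and the toy data are all made up for illustration.

```python
from math import sqrt

def oob_strength_and_correlation(y, oob_preds):
    """Hypothetical helper: OOB estimates of strength, correlation, bound.

    y         -- list of true class labels, one per training instance
    oob_preds -- one dict per tree, {instance index: predicted class},
                 holding only the instances left out of that tree
    Assumes at least two classes and a positive strength.
    """
    classes = sorted(set(y) | {c for p in oob_preds for c in p.values()})

    # Count the out-of-bag votes each instance receives for each class.
    votes = {}
    for preds in oob_preds:
        for i, c in preds.items():
            votes.setdefault(i, {k: 0 for k in classes})[c] += 1

    # Margin: vote share of the true class minus the largest share
    # among the other classes (the "rival" class).
    margin, rival = {}, {}
    for i, v in votes.items():
        total = sum(v.values())
        rival[i] = max((c for c in classes if c != y[i]), key=lambda c: v[c])
        margin[i] = (v[y[i]] - v[rival[i]]) / total

    n = len(margin)
    s = sum(margin.values()) / n                      # strength
    var_margin = sum(m * m for m in margin.values()) / n - s * s

    # Per-tree standard deviation of the raw margin (+1, -1 or 0).
    sds = []
    for preds in oob_preds:
        if not preds:
            continue  # a tree with no left-out instances adds nothing
        p1 = sum(c == y[i] for i, c in preds.items()) / len(preds)
        p2 = sum(c == rival[i] for i, c in preds.items()) / len(preds)
        sds.append(sqrt(max(0.0, p1 + p2 - (p1 - p2) ** 2)))
    mean_sd = sum(sds) / len(sds)

    corr = var_margin / mean_sd ** 2                  # mean correlation
    bound = corr * (1 - s * s) / (s * s)              # needs s > 0
    return s, corr, bound

# Toy data (invented): 6 instances, 3 trees, each tree leaves out 4.
y = [0, 0, 0, 1, 1, 1]
oob_preds = [
    {0: 0, 1: 1, 3: 1, 4: 1},   # tree 1 is wrong on instance 1
    {1: 0, 2: 0, 4: 1, 5: 0},   # tree 2 is wrong on instance 5
    {0: 0, 2: 0, 3: 0, 5: 1},   # tree 3 is wrong on instance 3
]
s, corr, bound = oob_strength_and_correlation(y, oob_preds)
print(round(s, 3), round(corr, 3), round(bound, 3))  # 0.5 0.333 1.0
```

With only 3 trees the vote shares are very coarse, so these numbers
are an illustration of the arithmetic rather than a useful estimate.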

Thanks in advance,

Martin



