[R] statistical significance of accuracy increase in classification

Max Kuhn mxkuhn at gmail.com
Wed Feb 25 15:01:23 CET 2009


I have a few thoughts.

 - (I believe) it is usually better to put confidence intervals on
these metrics instead of relying on p-values. The intervals will allow
you to make inferential statements and give you a way of characterizing
the uncertainty in the estimates. You've seen how to do this with
accuracy. For Kappa, there is probably an analytical formula for a CI,
but I don't know that it is in R. I would use the bootstrap (via the
boot or bootstrap package) to get intervals for Kappa.
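As a rough sketch of the bootstrap approach with the boot package: the
data frame and column names below (`results`, `obs`, `pred`) are made
up for illustration, and the data are simulated, so substitute your own
observed and predicted classes.

```r
library(boot)

## simulated stand-in for your test-set results: one row per sample,
## with the true class (obs) and the model's prediction (pred)
set.seed(1)
results <- data.frame(
  obs  = factor(sample(c("A", "B"), 200, replace = TRUE)),
  pred = factor(sample(c("A", "B"), 200, replace = TRUE)))

## statistic function for boot(): Cohen's kappa on a resampled index set
kappaStat <- function(data, index) {
  tab <- table(data$obs[index], data$pred[index])
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                      # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2  # chance agreement
  (po - pe) / (1 - pe)
}

b <- boot(results, kappaStat, R = 2000)
boot.ci(b, type = "perc")  # percentile bootstrap CI for kappa
```

The same pattern (a statistic function taking the data and a vector of
resampled row indices) works for accuracy or any other metric.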

 - It sounds like some of the models were generated outside of R. I
think that the sampling uncertainty can be large. In other words, if
you were to do another training/test split, you would get different
results, so a CI for accuracy or Kappa on a single test set doesn't
really reflect this sampling noise. If you were building the models in
R, I would suggest doing many training/test splits and looking at the
distributions of those metrics.
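For example, a minimal sketch of repeated splits, using lda() from MASS
on the built-in iris data purely as a stand-in for whatever model you
are fitting:

```r
library(MASS)

set.seed(2)
## 100 random 75/25 training/test splits; record test-set accuracy each time
accs <- replicate(100, {
  inTrain <- sample(nrow(iris), floor(0.75 * nrow(iris)))
  fit  <- lda(Species ~ ., data = iris[inTrain, ])
  pred <- predict(fit, iris[-inTrain, ])$class
  mean(pred == iris$Species[-inTrain])
})

## the spread across resamples is the sampling noise a single
## test-set CI would miss
quantile(accs, c(0.025, 0.5, 0.975))
```

Comparing these distributions between two models is usually more
convincing than a single-split comparison.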


More information about the R-help mailing list