[R] statistical significance of accuracy increase in classification

Wed Feb 25 15:42:18 CET 2009

Max,

Thanks for the reply. Yes, the models were built outside R (I will see what I can do to run some of them inside R in the future), and the sampling is extremely skewed. As truth, or reference, we use results from a field exercise in which people actually visited the locations and described them in detail. Which sites were visited depended very much on accessibility, and the majority of sites are not accessible. Unfortunately, when results get reported to managers, they do care about accuracy, for example, but less about CIs, and even less about the skewed sampling, unless I can prove that it gives unacceptable results.

Do you know of any good reference that discusses kappa for classification, and maybe CIs for kappa?

Thanks again for your input,

Monica 

> Date: Wed, 25 Feb 2009 09:01:23 -0500
> Subject: Re: [R] statistical significance of accuracy increase in classification
> From: mxkuhn at gmail.com
> To: pisicandru at hotmail.com
> CC: r-help at r-project.org
> 
> Monica,
> 
> I have a few thoughts.
> 
> - (I believe) it is usually better to put confidence intervals on these
> metrics instead of relying on p-values. The intervals will allow you to
> make inferential statements and give you a way of characterizing the
> uncertainty in the estimates. You've seen how to do this with
> accuracy. For Kappa, there is probably an analytical formula for a CI,
> but I don't know that it is in R. I would use the bootstrap (via the
> boot or bootstrap package) to get intervals for kappa.
> 
> - It sounds like some of the models were generated outside of R. I
> think that the sampling uncertainty can be large. In other words, if
> you were to do another training/test split, you would get different
> results, so a CI for accuracy or kappa on a single test set doesn't
> really reflect this sampling noise. If you were doing models in R, I
> would suggest that you do many training/test splits and look at the
> distributions of those metrics.
> 
> 
> Max
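
For reference, the bootstrap interval for kappa suggested above could be sketched roughly as follows. This is only a minimal base-R sketch, not the boot or bootstrap package route: it assumes the reference labels and the predicted labels (e.g. exported from the software where the models were run) are available as two vectors of equal length. The function names cohen_kappa and boot_kappa_ci, the class names, and the simulated data are all made up for illustration.

```r
# Cohen's kappa from two label vectors (reference vs. predicted).
cohen_kappa <- function(truth, pred) {
  # Use a common set of levels so the confusion table is square.
  lev <- union(levels(factor(truth)), levels(factor(pred)))
  tab <- table(factor(truth, levels = lev), factor(pred, levels = lev))
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n                      # observed agreement
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  (p_obs - p_exp) / (1 - p_exp)
}

# Percentile bootstrap interval: resample test cases with replacement
# and recompute kappa on each resample.
boot_kappa_ci <- function(truth, pred, n_boot = 2000, level = 0.95) {
  n <- length(truth)
  stats <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    cohen_kappa(truth[idx], pred[idx])
  })
  alpha <- (1 - level) / 2
  # na.rm guards against degenerate resamples where kappa is undefined.
  quantile(stats, c(alpha, 1 - alpha), na.rm = TRUE)
}

# Hypothetical, simulated data with skewed class frequencies
# (not taken from the study discussed in this thread).
set.seed(1)
classes <- c("forest", "grass", "water")
truth <- sample(classes, 300, replace = TRUE, prob = c(0.7, 0.2, 0.1))
pred  <- ifelse(runif(300) < 0.8, truth, sample(classes, 300, replace = TRUE))

cohen_kappa(truth, pred)    # point estimate
boot_kappa_ci(truth, pred)  # approximate 95% percentile interval
```

Note that this interval only reflects resampling of the one test set at hand. It does not capture the variation between different training/test splits mentioned above, and it inherits whatever bias the skewed field sampling introduces.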