[R] statistical significance of accuracy increase in classification

Tue Feb 24 18:48:19 CET 2009

Hi again,

Looking more into test statistics, I realized that maybe I can use power.prop.test to check whether the difference between the 2 accuracies is zero or not. Do you have any comments about that? Also, should I consider the kappa statistic a kind of proportion as well and use the same test? If this does not violate any important assumption, then:

power.prop.test(n = 146, p1 = 0.7877, p2 = 0.8014, strict = TRUE)

     Two-sample comparison of proportions power calculation 
              n = 146
             p1 = 0.7877
             p2 = 0.8014
      sig.level = 0.05
          power = 0.0596356
    alternative = two.sided
 NOTE: n is number in *each* group 

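Note that 0.0596 is the power of the test, not a p-value: with n = 146 in each group, a two-sided test at the 0.05 level would detect a difference this small only about 6% of the time. So this output alone does not say whether the two accuracies differ significantly.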
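
Since power.prop.test reports power rather than a p-value, a rough sketch of getting an actual p-value with prop.test follows. The counts of correct classifications are an assumption, back-calculated from the accuracies above (0.7877 and 0.8014 of 146), and the test treats the two classifications as independent samples, which may not hold when both are scored at the same reference locations.

```r
# Assumption: counts of correct classifications are recovered from the
# reported accuracies; they are not taken from the confusion matrices.
n <- 146
correct <- round(c(0.7877, 0.8014) * n)   # 115 and 117 correct out of 146

# Two-sample test of equal proportions (chi-squared, with continuity correction).
# Caveat: this assumes two independent samples of size n.
res <- prop.test(x = correct, n = c(n, n))
print(res$p.value)
```

A large p-value here would only mean that this particular test cannot distinguish the two accuracies; it says nothing about the paired structure of the data.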

For the kappa statistic it would be:

power.prop.test(n = 146, p1 = 0.3675, p2 = 0.4315, strict = TRUE)

     Two-sample comparison of proportions power calculation 
              n = 146
             p1 = 0.3675
             p2 = 0.4315
      sig.level = 0.05
          power = 0.1999816
    alternative = two.sided
 NOTE: n is number in *each* group 
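
Because both classifications are evaluated at the same reference locations, a paired analysis may be a better fit than a two-sample comparison, and kappa is not a simple proportion. The sketch below is one possible approach, not an established recipe for this problem: McNemar's test for the accuracy difference and a percentile bootstrap over locations for the kappa difference. The helper cohen_kappa and the toy vectors ref, pred1 and pred2 are hypothetical and are not the land-cover data from this thread.

```r
# Hypothetical helper: unweighted Cohen's kappa from two label vectors.
cohen_kappa <- function(pred, ref, levels) {
  tab <- table(factor(pred, levels = levels), factor(ref, levels = levels))
  n  <- sum(tab)
  po <- sum(diag(tab)) / n                       # observed agreement
  pe <- sum(rowSums(tab) * colSums(tab)) / n^2   # agreement expected by chance
  (po - pe) / (1 - pe)
}

# Hypothetical toy data: one reference vector, two competing classifications.
set.seed(1)
lv    <- c(13, 14, 15)
ref   <- sample(lv, 200, replace = TRUE, prob = c(0.7, 0.15, 0.15))
pred1 <- ifelse(runif(200) < 0.75, ref, sample(lv, 200, replace = TRUE))
pred2 <- ifelse(runif(200) < 0.85, ref, sample(lv, 200, replace = TRUE))

# Paired test on accuracy: cross-tabulate correct/incorrect per location.
ok1 <- factor(pred1 == ref, levels = c(FALSE, TRUE))
ok2 <- factor(pred2 == ref, levels = c(FALSE, TRUE))
mc  <- mcnemar.test(table(ok1, ok2))
print(mc$p.value)

# Paired bootstrap on kappa: resample locations, keeping the triplets together.
B <- 2000
n <- length(ref)
diffs <- replicate(B, {
  i <- sample.int(n, n, replace = TRUE)
  cohen_kappa(pred2[i], ref[i], lv) - cohen_kappa(pred1[i], ref[i], lv)
})
ci <- quantile(diffs, c(0.025, 0.975), na.rm = TRUE)
print(ci)   # an interval that excludes 0 would suggest a real kappa difference
```

With the real data, ref, class1 and class2 from the original message would take the place of the toy vectors, with lv set to the six class codes used there.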

Any comments are really appreciated,

Monica

----------------------------------------
> From: pisicandru at hotmail.com
> To: r-help at r-project.org
> CC: max.kuhn at pfizer.com
> Subject: [R] statistical significance of accuracy increase in classification
> Date: Tue, 24 Feb 2009 16:22:41 +0000
>
>
> Hi everyone,
>
> I would like to test for the statistical significance (for what it's worth ...) of the increase in classification accuracy and kappa statistic between different land classifications. The classifications were done using other software (like eCognition and See5), but the results were "sampled" at locations where I know the "reference" class. So I built the confusion matrix using the package "caret". For now I am interested in the overall results, which give the overall classification accuracy and the kappa statistic, among others. Depending on which classification I test, I get a small increase in accuracy and a slightly larger increase in the kappa statistic. I wonder if there is a way to test the statistical significance of the accuracy and kappa increase between the 2 classifications.
>
> Data example and some code:
>
> library(caret)
>
> ref <- c(15, 13, 13, 13, 13, 15, 14, 14, 14, 15, 13, 13, 13, 15, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 14, 13, 13, 13, 13, 13, 13, 13, 15, 13, 13, 15, 13, 15, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,13, 14, 13, 13, 13, 13, 13, 14, 14, 15, 15, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13, 13,13, 13, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 13, 13, 14, 13, 13, 13, 13, 13, 13, 15, 13, 13, 13, 13, 13, 13)
>
> class1 <- c(14, 14, 13, 13, 13, 15, 13, 14, 15, 14, 14, 13, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 15, 13, 14, 13, 13, 14, 13, 13, 15, 13, 13, 13, 13, 13, 13, 13,13, 13, 15, 21, 13, 15, 13, 21, 13, 13, 14, 13, 15, 13, 15, 13, 13, 14, 13, 13, 13, 13, 13, 13, 13,13, 14, 14, 13, 13, 13, 13, 15, 15, 15, 15, 13, 13, 13, 13, 13, 5, 13, 15, 13, 13, 13, 13, 13, 13,15, 13, 15, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13)
>
> class2 <- c(14, 15, 13, 13, 13, 15, 13, 14, 15, 15, 14, 13, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 15, 13, 14, 13, 13, 15, 13, 13, 15, 14, 13, 13, 13, 13, 13, 13,13, 13, 15, 13, 13, 15, 13, 21, 13, 13, 13, 13, 15, 13, 15, 15, 13, 14, 13, 13, 13, 13, 13, 13, 15,13, 14, 14, 13, 13, 13, 13, 15, 14, 15, 15, 13, 14, 13, 13, 13, 15, 13, 15, 13, 13, 13, 13, 13, 13,15, 13, 15, 14, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 22, 13, 13, 13, 13, 13, 13, 13,13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13)
>
> ref1 <- factor(ref, levels = c(5, 13, 14, 15, 21, 22))
> pred1 <- factor(class1, levels = c(5, 13, 14, 15, 21, 22))
> pred2 <- factor(class2, levels = c(5, 13, 14, 15, 21, 22))
>
> t1 <- table(pred1, ref1)
> t2 <- table(pred2, ref1)
>
> cm1 <- confusionMatrix(t1)
> cm1$overall
>
> cm2 <- confusionMatrix(t2)
> cm2$overall
>
> As you can see, the increase in accuracy is very small, but the increase in kappa is a little more substantial. Is this increase statistically significant?
>
> Thanks for any help,
>
> Monica