[R] Cross-validation accuracy in SVM

Thu Jan 20 22:22:37 CET 2005

Ton van Daelen wrote:
> Hi all -
> 
> I am trying to tune an SVM model by optimizing the cross-validation
> accuracy. Maximizing this value doesn't necessarily seem to minimize the
> number of misclassifications. Can anyone tell me how the
> cross-validation accuracy is defined? In the output below, for example,
> cross-validation accuracy is 92.2%, while the number of correctly
> classified samples is (1476+170)/(1476+170+4) = 99.8% !?
> 
> Thanks for any help.
> 
> Regards - Ton

The percent classified correctly is an improper scoring rule: it can be
maximized by a model whose predicted values are bogus.  In addition, one
can add a very important predictor and have the percent actually decrease.
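
To make the point concrete, here is a minimal sketch in plain Python (no
libraries) that compares the percent correct with the Brier score, which is a
proper scoring rule.  The outcomes and both sets of predicted probabilities are
made up for illustration; the names p_always_zero and p_informative are
hypothetical and have nothing to do with the SVM fit quoted below.

```python
# Illustrative, made-up data: 10 cases, 2 of them positive.
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Hypothetical forecasts (assumptions for illustration only).
p_always_zero = [0.0] * 10  # bogus: claims a positive can never occur
p_informative = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.55, 0.55, 0.45, 0.9]


def accuracy(y, p, threshold=0.5):
    """Proportion classified correctly after thresholding the probabilities."""
    hits = sum((pi >= threshold) == bool(yi) for yi, pi in zip(y, p))
    return hits / len(y)


def brier(y, p):
    """Mean squared error of the probabilities (lower is better)."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


for name, p in [("always zero", p_always_zero), ("informative", p_informative)]:
    print(f"{name:12s} accuracy={accuracy(y, p):.3f}  brier={brier(y, p):.4f}")
```

With these numbers the bogus forecast wins on accuracy (0.8 against 0.7) but
loses on the Brier score (0.2 against roughly 0.098), so ranking models by
percent correct would pick the forecast that carries no information.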

Frank Harrell

> 
> ---
> Parameters:
>    SVM-Type:  C-classification 
>  SVM-Kernel:  radial 
>        cost:  8 
>       gamma:  0.007 
> 
> Number of Support Vectors:  1015
> 
>  ( 148 867 )
> 
> Number of Classes:  2 
> 
> Levels: 
>  false true
> 
> 5-fold cross-validation on training data:
> 
> Total Accuracy: 92.24242 
> Single Accuracies:
>  90 93.33333 94.84848 92.72727 90.30303 
> 
> Contingency Table
>            predclasses
> origclasses false true
>       false 1476     0
>       true     4   170
> 
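
The two figures in the question can be recomputed from the output above.  The
sketch below (plain Python, counts and fold accuracies copied from the quoted
output) assumes the contingency table was built from predictions on the same
data used to fit the model; the output itself does not say so.  Under that
assumption the table measures the fit to the training data, while the total
accuracy is the average over the five held-out folds, so the two numbers are
not expected to agree.

```python
# Counts copied from the contingency table above:
# keys are (original class, predicted class).
table = {
    ("false", "false"): 1476, ("false", "true"): 0,
    ("true", "false"): 4,     ("true", "true"): 170,
}

n = sum(table.values())
correct = sum(c for (orig, pred), c in table.items() if orig == pred)
table_accuracy = 100 * correct / n

# Accuracy of always predicting the more frequent class.
n_false = table[("false", "false")] + table[("false", "true")]
baseline = 100 * max(n_false, n - n_false) / n

# Per-fold accuracies copied from "Single Accuracies" above.
fold_accuracies = [90, 93.33333, 94.84848, 92.72727, 90.30303]
mean_fold = sum(fold_accuracies) / len(fold_accuracies)

print(f"accuracy from table:     {table_accuracy:.2f}%")
print(f"majority-class baseline: {baseline:.2f}%")
print(f"mean of fold accuracies: {mean_fold:.2f}%")
```

The mean of the five fold accuracies reproduces the reported total accuracy of
92.24%, which is only a little above the 89.45% obtained by always predicting
"false".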

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University