[R] using cvlm to do cross-validation

C Lin baccts at hotmail.com
Thu Mar 28 21:51:45 CET 2013


Hello,

I ran a 10-fold cross-validation using CVlm from the DAAG package, but I am not sure how to assess the result. Does this result mean my model is a good model?
I understand that the overall ms is the mean of the squared cross-validation residuals, i.e. the sum of squares over all folds divided by the number of observations. But is 0.0987 a good number? The response (i.e. gailRel5yr) has min, 1st quartile, median, mean, 3rd quartile, and max as follows: (0.462, 0.628, 0.806, 0.896, 1.000, 2.400)
In the plot generated by CVlm, the points do not look very tightly clustered. Thanks in advance.


> CVlm(gailRel5yr~risk.sum,m=10)
Analysis of Variance Table

Response: gailRel5yr
          Df Sum Sq Mean Sq F value Pr(>F)    
risk.sum   1   4.19    4.19    44.8  2e-09 ***
Residuals 88   8.24    0.09                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


fold 1 
Observations in test set: 9 
                  3      7       17     27     46     66     67    83     89
risk.sum    27.2345 66.447 29.20988 33.806 28.861 20.293 29.210 1.883 12.482
cvpred       0.9693  1.607  1.00148  1.076  0.996  0.856  1.001 0.557  0.729
gailRel5yr   1.0000  1.333  1.00000  0.778  0.667  1.000  0.750 0.727  1.000
CV residual  0.0307 -0.274 -0.00148 -0.298 -0.329  0.144 -0.251 0.170  0.271

Sum of squares = 0.46    Mean square = 0.05    n = 9 

fold 2 
Observations in test set: 9 
                 5     41     42     49     51     64    69      81      84
risk.sum    28.529 24.779 28.529 16.194 47.222  8.383 5.813  1.8832 16.1937
cvpred       0.975  0.922  0.975  0.800  1.241  0.688 0.652  0.5958  0.7996
gailRel5yr   0.625  0.533  1.143  0.636  1.833  0.462 1.000  0.5385  0.7143
CV residual -0.350 -0.389  0.168 -0.163  0.592 -0.227 0.348 -0.0573 -0.0853

Sum of squares = 0.86    Mean square = 0.1    n = 9 

fold 3 
Observations in test set: 9 
                 2       8     12     25     30     47     56     74     82
risk.sum    24.043 12.5825 10.969 16.803 29.017 49.341 15.455 28.256 21.906
cvpred       0.925  0.7651  0.743  0.824  0.995  1.279  0.805  0.984  0.896
gailRel5yr   0.545  0.6923  0.571  0.500  0.714  1.857  0.714  0.667  0.500
CV residual -0.380 -0.0728 -0.171 -0.324 -0.281  0.578 -0.091 -0.318 -0.396

Sum of squares = 0.96    Mean square = 0.11    n = 9 

fold 4 
Observations in test set: 9 
                16    22   26     44     50      61      71     72     79
risk.sum    32.960 44.11 17.1 32.628 16.194  5.9823  5.9823 21.955 21.168
cvpred       1.030  1.19  0.8  1.025  0.786  0.6379  0.6379  0.870  0.858
gailRel5yr   1.667  1.57  1.0  0.500  1.000  0.6000  0.6000  0.625  1.143
CV residual  0.637  0.38  0.2 -0.525  0.214 -0.0379 -0.0379 -0.245  0.284

Sum of squares = 1.06    Mean square = 0.12    n = 9 

fold 5 
Observations in test set: 9 
                13      15      37    40     48     59     62     76    78
risk.sum    5.8134 28.5287 28.5287 5.982 29.766 45.754 10.468 28.878 1.883
cvpred      0.6144  0.9569  0.9569 0.617  0.976  1.217  0.685  0.962 0.555
gailRel5yr  0.6667  1.0000  1.0000 1.000  0.875  1.833  0.933  1.214 0.909
CV residual 0.0523  0.0431  0.0431 0.383 -0.101  0.617  0.249  0.252 0.354

Sum of squares = 0.79    Mean square = 0.09    n = 9 

fold 6 
Observations in test set: 9 
                19     32     33     55     57    68    80    86     88
risk.sum    14.719 28.529 24.043 10.468 20.293 12.48 1.883 5.813  5.982
cvpred       0.764  0.980  0.910  0.698  0.852  0.73 0.564 0.625  0.628
gailRel5yr   1.000  0.667  0.667  0.538  0.667  1.00 0.778 1.000  0.500
CV residual  0.236 -0.314 -0.243 -0.160 -0.185  0.27 0.214 0.375 -0.128

Sum of squares = 0.55    Mean square = 0.06    n = 9 

fold 7 
Observations in test set: 9 
                 20     24    36      45     52     63     65     87    90
risk.sum    35.3605 10.620 26.44  5.9823 29.766 31.074 16.194 20.293 1.883
cvpred       1.0896  0.702  0.95  0.6289  1.002  1.022  0.789  0.853 0.565
gailRel5yr   1.0000  1.000  0.50  0.6000  1.143  0.714  0.600  1.000 0.933
CV residual -0.0896  0.298 -0.45 -0.0289  0.141 -0.308 -0.189  0.147 0.369

Sum of squares = 0.61    Mean square = 0.07    n = 9 

fold 8 
Observations in test set: 9 
                18     21     23     28     38    70    73    75     77
risk.sum    25.656 26.239 49.353 16.682 9.7323 6.870 1.883 1.883 20.293
cvpred       0.943  0.953  1.337  0.794 0.6782 0.631 0.548 0.548  0.854
gailRel5yr   0.700  0.929  0.667  1.000 0.7500 0.944 0.667 0.778  0.462
CV residual -0.243 -0.024 -0.670  0.206 0.0718 0.314 0.119 0.230 -0.392

Sum of squares = 0.88    Mean square = 0.1    n = 9 

fold 9 
Observations in test set: 9 
                 6      9       34     35      39     43      54     60     85
risk.sum    46.480 29.030 16.19369 40.364 14.7192 17.826 17.8264 26.588 16.194
cvpred       1.241  0.985  0.79725  1.151  0.7757  0.821  0.8212  0.950  0.797
gailRel5yr   1.667  0.846  0.80000  1.000  0.8125  1.083  0.8333  0.556  0.533
CV residual  0.426 -0.139  0.00275 -0.151  0.0368  0.262  0.0122 -0.394 -0.264

Sum of squares = 0.52    Mean square = 0.06    n = 9 

fold 10 
Observations in test set: 9 
                 1      4    10     11     14     29     31     53    58
risk.sum    37.400 50.409 47.61 47.433 56.210 23.484 29.030 28.529 54.90
cvpred       1.065  1.224  1.19  1.188  1.296  0.894  0.962  0.956  1.28
gailRel5yr   0.909  1.667  0.90  1.650  1.444  0.600  0.545  0.571  2.40
CV residual -0.156  0.442 -0.29  0.462  0.149 -0.294 -0.416 -0.384  1.12

Sum of squares = 2.2    Mean square = 0.24    n = 9 

Overall (Sum over all 9 folds) 
    ms 
0.0987
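
To put the overall ms on the scale of the response, one rough check is to rebuild it from the per-fold sums of squares printed above and take its square root. The sketch below uses only the rounded numbers from this output, so it will not match CVlm to every digit. The variable names are illustrative and not part of DAAG. It also computes the in-sample residual mean square and R-squared from the ANOVA table for comparison.

```r
# Per-fold sums of squares, copied from the CVlm output above (rounded as printed)
fold_ss <- c(0.46, 0.86, 0.96, 1.06, 0.79, 0.55, 0.61, 0.88, 0.52, 2.2)
n_per_fold <- 9
n_total <- n_per_fold * length(fold_ss)   # 90 observations in total

# Overall cross-validated mean square: total sum of squares over all test observations
overall_ms <- sum(fold_ss) / n_total

# Root mean square prediction error, in the units of gailRel5yr
cv_rmse <- sqrt(overall_ms)

# In-sample residual mean square from the ANOVA table (Residuals row)
resid_ms <- 8.24 / 88

# In-sample share of variance explained, from the ANOVA sums of squares
r_squared <- 4.19 / (4.19 + 8.24)

print(round(c(overall_ms = overall_ms, cv_rmse = cv_rmse,
              resid_ms = resid_ms, r_squared = r_squared), 3))
```

Under these assumptions the typical cross-validated prediction error comes out at roughly 0.31, which can be compared directly with the quartiles of gailRel5yr quoted above. Fold 10 contributes about a quarter of the total sum of squares, mostly through observation 58.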

