[R] Validation / Training - test data

Sam Sam_Smith at me.com
Wed Sep 29 10:25:58 CEST 2010


Dear List,

I have developed two models that I want to use to predict a response, one with a binary response and one with an ordinal response.

My original plan was to divide the data into test (300 entries) and training (1000 entries) sets and check the predictive power of the model by looking at the % of correct predictions. However, I have been told by a colleague that 1300 entries is far too few to partition the data set, and that I should instead use the whole data set, assess the model with scores such as the c-value and the Brier score, and use bootstrapping.

I understand how to bootstrap in R; however, I have never used it on predicted values.

My questions are:

1. How do I use the boot() command to test the predictive power of my model?
2. Is it possible to bootstrap the Brier score, or is this not necessary?
3. (This is a separate point I am struggling with; I thought I would include it here instead of posting again.) I have selected the most likely model by AIC from a set of candidate GLMM models. However, as GLMM has no predict function, I have taken the best model, dropped the random effects, refitted it as a glm and used the predict function on that fit. Is this OK?
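
To make questions 1 and 2 concrete, here is a minimal sketch of what I understand the suggestion to be: fit the model on the whole data set, then use boot() to estimate how optimistic the apparent Brier score is. Everything in it is made up for illustration and is not my real model: the simulated data frame `dat`, the predictors `x1` and `x2`, the helper `brier`, the statistic function `excess` and the choice of R = 200 are all assumptions. I am not sure this is the right way to do it, which is really what I am asking.

```r
library(boot)

set.seed(1)

# Simulated stand-in for the real data (hypothetical variables x1, x2, y)
n <- 1300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1 - 0.4 * dat$x2))

# Brier score: mean squared difference between predicted probability and outcome
brier <- function(p, y) mean((p - y)^2)

# Apparent Brier score: model fitted and evaluated on the same data
fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
apparent <- brier(fitted(fit), dat$y)

# Statistic for boot(): refit on the resample, then compare the score on the
# original data with the score on the resample itself
excess <- function(data, idx) {
  b <- data[idx, ]
  fit_b <- glm(y ~ x1 + x2, family = binomial, data = b)
  brier_boot <- brier(fitted(fit_b), b$y)
  brier_orig <- brier(predict(fit_b, newdata = data, type = "response"), data$y)
  brier_orig - brier_boot
}

bs <- boot(dat, excess, R = 200)

# Bootstrap-corrected Brier score: apparent score plus the average excess
corrected <- apparent + mean(bs$t)
print(c(apparent = apparent, corrected = corrected))
```

If this is roughly right, I assume the same pattern would work for the c-value by swapping `brier` for a function that computes it.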

Thanks

Sam


