[R] Caret and Model Prediction

Lorenzo Isella lorenzo.isella at gmail.com
Sun Oct 5 17:54:59 CEST 2014


Dear All,
I am learning the ropes of the caret package for automatic model
training, more or less following the steps of the tutorial at

http://bit.ly/ZJQINa

However, there are a few points on which I would like some advice.

Consider for instance the following model

#############################################################

set.seed(825)

fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)

gbmGrid <- expand.grid(interaction.depth = c(1, 5, 9),
                        n.trees = (1:30)*50,
                        shrinkage = 0.05)

nrow(gbmGrid)

gbmFit <- train(Ca + P + pH + SOC + Sand ~ ., data = training,
                method = "gbm",
                trControl = fitControl,
                ## This last option is actually one
                ## for gbm() that passes through
                verbose = TRUE,
                ## Now specify the exact models
                ## to evaluate:
                tuneGrid = gbmGrid)

#############################################################

I am trying to tune a model that predicts the values of 5 columns
whose names are "Ca", "P", "pH", "SOC", and "Sand".

1) Am I using the formula syntax correctly?
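
One way to check how R reads a left-hand side such as Ca+P+pH+SOC+Sand is to build a model frame on toy data with base R. The sketch below does not use caret at all: the data frame `toy` and its columns `y1`, `y2` and `x` are made up for illustration, and it only shows what `model.frame()` and `model.response()` do with such a formula. That train() constructs its response in the same way is an assumption that would need checking against the caret sources.

```r
# Toy data; y1, y2 and x are hypothetical names used only for illustration.
toy <- data.frame(y1 = c(1, 2, 3),
                  y2 = c(10, 20, 30),
                  x  = c(0.5, 1.5, 2.5))

# Build the model frame the way formula-based fitting functions usually do.
mf <- model.frame(y1 + y2 ~ x, data = toy)

# "+" on the left-hand side is ordinary arithmetic, so the response is
# the element-wise sum of the two columns, stored as a single vector.
resp <- model.response(mf)
print(resp)
print(is.null(dim(resp)))  # checks whether the response is a plain vector

# A matrix response needs cbind() on the left-hand side instead.
mf2 <- model.frame(cbind(y1, y2) ~ x, data = toy)
resp2 <- model.response(mf2)
print(dim(resp2))
```

If the same holds for the call above, the model would be fitted to the sum of the five columns, which would be consistent with getting back a single column of predictions.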

I then try to apply my model to the test data with

mypred <- predict(gbmFit, newdata=test)

However, at this point I am left with a couple of questions:

2) Does "predict" automatically select the best tuned model in gbmFit?
If not, what am I supposed to do?
3) I do not get any error messages, but mypred consists of a single
column instead of 5 columns corresponding to the 5 variables I am
trying to predict, so something is obviously wrong (see point 1). Any
suggestions here?
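
A possible workaround for the single-column result is to fit one model per outcome and bind the predictions together afterwards. The following is only a minimal sketch under stated assumptions: the data are simulated, the names `toy_train`, `toy_test`, `out1`, `out2`, `x1` and `x2` are hypothetical, and `lm()` stands in for the real learner so that the example runs with base R alone.

```r
# Hypothetical toy data: two outcomes (out1, out2) and two predictors (x1, x2).
set.seed(1)
toy_train <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
toy_train$out1 <- 2 * toy_train$x1 + rnorm(20, sd = 0.1)
toy_train$out2 <- -1 * toy_train$x2 + rnorm(20, sd = 0.1)
toy_test <- data.frame(x1 = rnorm(5), x2 = rnorm(5))

outcomes   <- c("out1", "out2")
predictors <- c("x1", "x2")

# Fit one model per outcome; lm() is only a stand-in for the real learner.
# Listing the predictors explicitly keeps the other outcomes out of the
# right-hand side of each formula.
fits <- lapply(outcomes, function(y) {
  lm(reformulate(predictors, response = y), data = toy_train)
})
names(fits) <- outcomes

# Predict each outcome separately and bind the results column-wise,
# giving one column of predictions per outcome.
pred_matrix <- sapply(fits, predict, newdata = toy_test)
print(dim(pred_matrix))
print(colnames(pred_matrix))
```

With caret, the same pattern would presumably call train() once per outcome inside the loop, reusing fitControl and gbmGrid, with the other four outcome columns excluded from the predictors.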

Many thanks

Lorenzo


