[R] Caret and Model Prediction

Sun Oct 5 22:51:36 CEST 2014

Thanks a lot.
At this point I wonder: given that my response consists of 5 outcomes
for each set of features, should I train 5 different models (one for
each outcome)?
Cheers

Lorenzo
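
A minimal sketch of the one-model-per-outcome approach asked about above,
assuming the caret and gbm packages are installed. The data are synthetic
stand-ins, and the names features, responses, in_train and fits are
hypothetical, not taken from the original post. The grid is kept small so
the example runs quickly. As far as I know, recent caret versions also
expect an n.minobsinnode column in the gbm tuning grid, which the code
quoted below predates.

```r
library(caret)  # assumption: caret and gbm are installed

set.seed(825)

# Synthetic stand-in for the real data: 3 features, 5 outcomes.
n <- 200
features <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
outcomes <- c("Ca", "P", "pH", "SOC", "Sand")
responses <- as.data.frame(
  sapply(outcomes, function(o) features$x1 + rnorm(n, sd = 0.5))
)
in_train <- seq_len(150)  # first 150 rows for training, the rest for testing

fitControl <- trainControl(method = "cv", number = 5)

gbmGrid <- expand.grid(interaction.depth = c(1, 3),
                       n.trees = c(50, 100),
                       shrinkage = 0.05,
                       n.minobsinnode = 10)

# Fit one model per outcome, so each call has a single response.
fits <- lapply(setNames(outcomes, outcomes), function(o) {
  train(x = features[in_train, ],
        y = responses[in_train, o],
        method = "gbm",
        trControl = fitControl,
        tuneGrid = gbmGrid,
        verbose = FALSE)  # passed through to gbm()
})

# predict() on a train object uses the final model, refit with the
# selected tuning parameters (inspect them with fits$Ca$bestTune).
mypred <- as.data.frame(
  lapply(fits, predict, newdata = features[-in_train, ])
)
dim(mypred)  # one column per outcome
```

Each element of fits can be printed on its own to inspect the resampling
results for that outcome.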

On Sun, Oct 05, 2014 at 11:04:01AM -0700, Jia Xu wrote:
>Hi, Lorenzo:
>  For 1), I think the formula is not correct. The formula should be
>outcome ~ features, with a single outcome on the left-hand side, and
>that is why you get the weird result in 3).
>    2) predict in caret will automatically use the best tuned model if
>there is one (sometimes it fails). You can print the model to see the
>cross-validation results. Furthermore, you may specify the performance
>metric used to select the optimal model. Please see the caret tutorial
>for details on how to do this.
>
>On Sun, Oct 5, 2014 at 8:54 AM, Lorenzo Isella <lorenzo.isella at gmail.com>
>wrote:
>
>> Dear All,
>> I am learning the ropes of CARET for automatic model training, more or
>> less following the steps of the tutorial at
>>
>> http://bit.ly/ZJQINa
>>
>> However, there are a few things about which I would like some
>> advice.
>>
>> Consider for instance the following model
>>
>> #############################################################
>>
>> set.seed(825)
>>
>> fitControl <- trainControl(## 10-fold CV
>>                           method = "repeatedcv",
>>                           number = 10,
>>                           ## repeated ten times
>>                           repeats = 10)
>>
>> gbmGrid <-  expand.grid(interaction.depth = c(1, 5, 9),
>>                        n.trees = (1:30)*50,
>>                        shrinkage = 0.05)
>>
>> nrow(gbmGrid)
>>
>> gbmFit <- train(Ca+P+pH+SOC+Sand~ ., data = training,
>>                 method = "gbm",
>>                 trControl = fitControl,
>>                 ## This last option is actually one
>>                 ## for gbm() that passes through
>>                 verbose = TRUE,
>>                 ## Now specify the exact models
>>                 ## to evaluate:
>>                 tuneGrid = gbmGrid
>>                 )
>>
>> #############################################################
>>
>> I am trying to tune a model that predicts the values of 5 columns
>> whose names are "Ca","P","pH", "SOC", and "Sand".
>>
>> 1) Am I using the formula syntax in a correct way?
>>
>> I then try to apply my model on the test data by coding
>>
>> mypred <- predict(gbmFit, newdata=test)
>>
>> However, at this point I am left with a couple of questions:
>>
>> 2) Does "predict" automatically select the best tuned model in gbmFit?
>> If not, what am I supposed to do?
>> 3) I do not get any error messages, but mypred consists of a single
>> column instead of 5 columns corresponding to the 5 variables I am
>> trying to predict, so something is obviously wrong (see point 1). Any
>> suggestions here?
>>
>> Many thanks
>>
>> Lorenzo
>>
>
>
>
>-- 
>Jia Xu