[R] Caret and Model Prediction

Jia Xu dolremi at gmail.com
Sun Oct 5 20:04:01 CEST 2014


Hi, Lorenzo:
  For 1) I think the formula is not correct. train() fits a single outcome,
so the formula should be outcome ~ features. With Ca+P+pH+SOC+Sand on the
left-hand side, I believe R reads it as one response (the sum of the five
columns), and that's why you get the weird result in 3). To predict all
five variables, fit one model per outcome.
    2) predict() in caret automatically uses the best tuned model if
there is one (sometimes it fails). You can print the model to see the
cross-validation results. Furthermore, you can specify the performance
metric used to pick the optimal model. Please see the caret tutorial for
details on how to do that.
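
To illustrate points 1) to 3), here is a minimal sketch of fitting one model
per outcome and binding the predictions into five columns. It is not your
actual setup: the data are simulated, the feature names x1..x3 are made up,
and the grid and resampling are kept small so it runs quickly. It also assumes
the gbm package is installed and a caret version whose gbm grid includes
n.minobsinnode (older versions may not accept that column).

```r
library(caret)

set.seed(825)

## Toy stand-in for the real data: the five outcome columns named in the
## question plus three hypothetical features x1..x3.
n <- 120
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
outcomes <- c("Ca", "P", "pH", "SOC", "Sand")
for (o in outcomes) {
  dat[[o]] <- dat$x1 - 0.5 * dat$x2 + rnorm(n, sd = 0.3)
}
features <- setdiff(names(dat), outcomes)

training <- dat[1:90, ]
test     <- dat[91:n, ]

## Small settings for speed; the original post used repeatedcv with
## 10 folds repeated 10 times.
fitControl <- trainControl(method = "cv", number = 5)

## Assumption: recent caret versions require n.minobsinnode in the gbm grid.
gbmGrid <- expand.grid(interaction.depth = c(1, 3),
                       n.trees = c(50, 100),
                       shrinkage = 0.05,
                       n.minobsinnode = 10)

## One train() call per outcome, each with a single-response formula
## of the form outcome ~ x1 + x2 + x3.
fits <- lapply(outcomes, function(o) {
  train(reformulate(features, response = o),
        data = training[, c(features, o)],
        method = "gbm",
        trControl = fitControl,
        tuneGrid = gbmGrid,
        metric = "RMSE",      # performance metric used to pick the best model
        verbose = FALSE)
})
names(fits) <- outcomes

## Tuning parameters selected by resampling for one of the models
print(fits$Ca$bestTune)

## predict() on a train object uses the final (best tuned) model.
## Collect the five prediction vectors into one data frame.
mypred <- as.data.frame(lapply(fits, predict, newdata = test))
print(dim(mypred))
```

Each element of fits is an ordinary train object, so print(fits$Ca) shows the
resampling results for that outcome, and mypred has one column per outcome.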

On Sun, Oct 5, 2014 at 8:54 AM, Lorenzo Isella <lorenzo.isella at gmail.com>
wrote:

> Dear All,
> I am learning the ropes of CARET for automatic model training, more or
> less following the steps of the tutorial at
>
> http://bit.ly/ZJQINa
>
> However, there are a few things about which I would like a piece of
> advice.
>
> Consider for instance the following model
>
> #############################################################
>
> set.seed(825)
>
> fitControl <- trainControl(## 10-fold CV
>                           method = "repeatedcv",
>                           number = 10,
>                           ## repeated ten times
>                           repeats = 10)
>
> gbmGrid <-  expand.grid(interaction.depth = c(1, 5, 9),
>                        n.trees = (1:30)*50,
>                        shrinkage = 0.05)
>
> nrow(gbmGrid)
>
> gbmFit <- train(Ca+P+pH+SOC+Sand~ ., data = training,
>                 method = "gbm",
>                 trControl = fitControl,
>                 ## This last option is actually one
>                 ## for gbm() that passes through
>                 verbose = TRUE,
>                 ## Now specify the exact models
>                 ## to evaluate:
>                 tuneGrid = gbmGrid
>                 )
>
> #############################################################
>
> I am trying to tune a model that predicts the values of 5 columns
> whose names are "Ca","P","pH", "SOC", and "Sand".
>
> 1) Am I using the formula syntax in a correct way?
>
> I then try to apply my model on the test data by coding
>
> mypred <- predict(gbmFit, newdata=test)
>
> However, at this point I am left with a couple of questions
>
> 2) does "predict" automatically select the best tuned model in gbmFit?
> and if not, what am I supposed to do?
> 3) I do not get any error messages, but mypred consists of a single
> column instead of 5 columns corresponding to the 5 variables I am
> trying to predict, so something is obviously wrong (see point 1). Any
> suggestions here?
>
> Many thanks
>
> Lorenzo
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jia Xu
