[R] Caret and Model Prediction

mxkuhn mxkuhn at gmail.com
Mon Oct 6 02:26:53 CEST 2014



> On Oct 5, 2014, at 4:51 PM, Lorenzo Isella <lorenzo.isella at gmail.com> wrote:
> 
> Thanks a lot.
> At this point I wonder: given that my response consists of 5
> outcomes for each set of features, should I train 5 different
> models (one for each outcome)?
> Cheers

caret can only model one outcome at a time, so yes.
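
A minimal sketch of what "one model per outcome" could look like. The data here is synthetic and the feature names (x1, x2, x3) are made up for illustration; method = "lm" is only a fast placeholder, and the gbm setup with tuneGrid from the original post would go in its place. As far as I can tell, a left-hand side such as Ca+P+pH+SOC+Sand is evaluated arithmetically, so the original call was fit to the sum of the five columns, which would explain the single prediction column.

```r
library(caret)

set.seed(825)

# Synthetic stand-in for the real data (hypothetical values):
# three features and the five outcome columns from the post.
n <- 60
training <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
outcomes <- c("Ca", "P", "pH", "SOC", "Sand")
for (o in outcomes) {
  training[[o]] <- training$x1 - training$x2 + rnorm(n, sd = 0.1)
}
test <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10))

features <- setdiff(names(training), outcomes)
fitControl <- trainControl(method = "cv", number = 5)

# One train() call per outcome; each formula has a single response,
# and the other four outcomes are kept out of the predictors.
fits <- lapply(outcomes, function(o) {
  train(reformulate(features, response = o),
        data = training[, c(features, o)],
        method = "lm",   # placeholder; swap in "gbm" plus tuneGrid as needed
        trControl = fitControl)
})
names(fits) <- outcomes

# predict() on each train object uses the final model refit with the
# selected tuning values; collect the results into one column per outcome.
mypred <- as.data.frame(lapply(fits, predict, newdata = test))
dim(mypred)
```

Printing any element of fits (for example fits$Ca) shows the resampling results for that outcome, and mypred ends up with one column per outcome rather than a single column.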

Max

> Lorenzo
> 
>> On Sun, Oct 05, 2014 at 11:04:01AM -0700, Jia Xu wrote:
>> Hi, Lorenzo:
>> 1) I think the formula is not correct. The formula should be
>> outcome ~ features, and that is why you get the odd result in 3).
>> 2) predict in caret will automatically use the best model if there
>> is one (sometimes it fails). You can print the model to see the
>> cross-validation results. Furthermore, you may specify the performance
>> metric used to select the optimal model. Please see the caret tutorial
>> for details.
>> 
>> On Sun, Oct 5, 2014 at 8:54 AM, Lorenzo Isella <lorenzo.isella at gmail.com>
>> wrote:
>> 
>>> Dear All,
>>> I am learning the ropes of CARET for automatic model training, more or
>>> less following the steps of the tutorial at
>>> 
>>> http://bit.ly/ZJQINa
>>> 
>>> However, there are a few things about which I would like a piece of
>>> advice.
>>> 
>>> Consider for instance the following model
>>> 
>>> #############################################################
>>> 
>>> set.seed(825)
>>> 
>>> fitControl <- trainControl(## 10-fold CV
>>>                          method = "repeatedcv",
>>>                          number = 10,
>>>                          ## repeated ten times
>>>                          repeats = 10)
>>> 
>>> gbmGrid <-  expand.grid(interaction.depth = c(1, 5, 9),
>>>                       n.trees = (1:30)*50,
>>>                       shrinkage = 0.05)
>>> 
>>> nrow(gbmGrid)
>>> 
>>> gbmFit <- train(Ca+P+pH+SOC+Sand~ ., data = training,
>>>                method = "gbm",
>>>                trControl = fitControl,
>>>                ## This last option is actually one
>>>                ## for gbm() that passes through
>>>                verbose = TRUE,
>>>                ## Now specify the exact models
>>>                ## to evaluate:
>>>                tuneGrid = gbmGrid
>>>                )
>>> 
>>> #############################################################
>>> 
>>> I am trying to tune a model that predicts the values of 5 columns
>>> whose names are "Ca","P","pH", "SOC", and "Sand".
>>> 
>>> 1) Am I using the formula syntax in a correct way?
>>> 
>>> I then try to apply my model on the test data by coding
>>> 
>>> mypred <- predict(gbmFit, newdata=test)
>>> 
>>> However, at this point I am left with a couple of questions
>>> 
>>> 2) does "predict" automatically select the best tuned model in gbmFit?
>>> and if not, what am I supposed to do?
>>> 3) I do not get any error messages, but mypred consists of a single
>>> column instead of 5 columns corresponding to the 5 variables I am
>>> trying to predict, so something is obviously wrong (see point 1). Any
>>> suggestions here?
>>> 
>>> Many thanks
>>> 
>>> Lorenzo
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> 
>> 
>> -- 
>> Jia Xu
> 


