[R] Getting error message, "LOOCV is not compatible with `resamples()` since only one resampling estimate is available. "

Wed Mar 4 11:35:12 CET 2020

Following up on my question: I can confirm that LOOCV itself gives me
results (RMSE/MAE values), but when I pass the LOOCV models to
rValues <- resamples(list(...)), it gives the error. Everything else works:
boot, k-fold CV, and even LGOCV (Leave-Group-Out Cross-Validation).
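
For reference, a minimal sketch of one way around this (an illustration, not
something taken from the thread): keep the LOOCV fits out of resamples() and
compare them through getTrainPerf() instead. As far as I can tell from
?train, LOOCV pools the held-out predictions into a single estimate, so
there is no per-resample distribution to draw a boxplot from. The mtcars
data, the "lm" method, and all object names are stand-ins chosen so the
example runs without extra packages.

```r
library(caret)

set.seed(30218)
# Shared 5-fold indices so both CV fits are evaluated on the same resamples
folds <- createFolds(mtcars$mpg, k = 5, returnTrain = TRUE)

ctrl_cv  <- trainControl(method = "cv", index = folds)
ctrl_loo <- trainControl(method = "LOOCV")

# Stand-in models: plain linear regressions on the built-in mtcars data
fit_cv_a <- train(mpg ~ wt,      data = mtcars, method = "lm", trControl = ctrl_cv)
fit_cv_b <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl_cv)
fit_loo  <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl_loo)

# Only fits that carry per-resample estimates go into resamples()
rValues <- resamples(list(CV_a = fit_cv_a, CV_b = fit_cv_b))
print(bwplot(rValues, metric = "MAE"))

# The LOOCV fit is compared through its single pooled estimate instead
perf <- rbind(
  cbind(model = "CV_a",  getTrainPerf(fit_cv_a)),
  cbind(model = "CV_b",  getTrainPerf(fit_cv_b)),
  cbind(model = "LOOCV", getTrainPerf(fit_loo))
)
print(perf)
```

The same pattern should carry over to the pls fits in the thread: pass
ran_CV and grid_CV (and the bootstrap fits) to resamples(), and tabulate
getTrainPerf(ran_locv) and getTrainPerf(grid_locv) alongside them.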

Regards

On Tue, Mar 3, 2020 at 10:57 PM javed khan <javedbtk111 using gmail.com> wrote:

> I am sorry for that... I am just using the caret package.
>
> Thanks
>
> On Tue, Mar 3, 2020 at 9:57 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
>> ... and you **still** have not told us what package(s)...
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>>
>> On Tue, Mar 3, 2020 at 12:28 PM javed khan <javedbtk111 using gmail.com> wrote:
>>
>>> The data is as follows. I have included the code for 10-fold CV and LOOCV.
>>>
>>> library(caret)
>>>
>>> d <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
>>> 13, 14, 15), Language = c(1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1,
>>> 1, 1, 3), Hardware = c(1, 2, 3, 1, 2, 4, 4, 2, 1, 1, 1, 5, 6,
>>> 1, 1), Duration = c(17, 7, 15, 18, 13, 5, 5, 11, 14, 5, 13, 31,
>>> 20, 26, 14), KSLOC = c(253.6, 40.5, 450, 214.4, 449.9, 50, 43,
>>> 200, 289, 39, 254.2, 128.6, 161.4, 164.8, 60.2), AdjFP = c(1217.1,
>>> 507.3, 2306.8, 788.5, 1337.6, 421.3, 99.9, 993, 1592.9, 240,
>>> 1611, 789, 690.9, 1347.5, 1044.3), RAWFP = c(1010, 457, 2284,
>>> 881, 1583, 411, 97, 998, 1554, 250, 1603, 724, 705, 1375, 976
>>> ), EffortMM = c(287, 82.5, 1107.31, 86.9, 336.3, 84, 23.2, 130.3,
>>> 116, 72, 258.7, 230.7, 157, 246.9, 69.9)), class = "data.frame",
>>> row.names = c(NA, -15L))
>>> # Assumption: the code below calls the response Effort, so rename EffortMM.
>>> names(d)[names(d) == "EffortMM"] <- "Effort"
>>>
>>>
>>> index <- createDataPartition(d$Effort, p = .70,list = FALSE)
>>> tr <- d[index, ]
>>> ts <- d[-index, ]
>>> index_2 <- createFolds(tr$Effort, returnTrain = TRUE, list = TRUE)
>>>
>>> ct_rand <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
>>>                         index = index_2, search = "random")
>>> ct_grid <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
>>>                         index = index_2, search = "grid")
>>>
>>> ct_locv <- trainControl(method = "LOOCV",  search="random")
>>> ct_locv2 <- trainControl(method = "LOOCV",   search="grid")
>>>
>>> ## ## ## ## ## Random search for 10-fold CV
>>>
>>> set.seed(30218)
>>> ran_CV <- train(Effort ~ ., data = tr,
>>>                     method = "pls",
>>>                     tuneLength = 15,
>>>                     metric = "MAE",
>>>                     preProc = c("center", "scale", "zv"),
>>>                     trControl = ct_rand)
>>> getTrainPerf(ran_CV)
>>> rn <- predict(ran_CV, newdata = ts)
>>> ##MAE(rn, ts$Effort)
>>>
>>>
>>> ## ## ## ## ##grid search  for 10 fold CV
>>>
>>> set.seed(30218)
>>> grid_CV <- train(Effort ~ ., data = tr,
>>>                      method = "pls",
>>>                      metric = "MAE",
>>>                      preProc = c("center", "scale", "zv"),
>>>                      trControl = ct_grid)
>>>
>>> getTrainPerf(grid_CV)
>>>
>>>   ## ## ## ## ##Random Search for LOOCV
>>>
>>> set.seed(30218)
>>> ran_locv <- train(Effort ~ ., data = tr,
>>>                 method = "pls",
>>>                 tuneLength = 15,
>>>                 metric = "MAE",
>>>                 preProc = c("center", "scale", "zv"),
>>>                 trControl = ct_locv)
>>> getTrainPerf(ran_locv)
>>> rn <- predict(ran_locv, newdata = ts)
>>> ##MAE(rn, ts$Effort)
>>>
>>>
>>>  ## ## ## ## ##Grid Search for LOOCV
>>>
>>> set.seed(30218)
>>> grid_locv <- train(Effort ~ ., data = tr,
>>>                  method = "pls",
>>>                  metric = "MAE",
>>>                  preProc = c("center", "scale", "zv"),
>>>                  trControl = ct_locv2)
>>>
>>> getTrainPerf(grid_locv)
>>>
>>> rValues <- resamples(list(Random_Search_CV = ran_CV,
>>>                           Grid_Search_CV = grid_CV,
>>>                           Random_Search_LOOCV = ran_locv,
>>>                           Grid_Search_LOOCV = grid_locv))
>>>
>>> bwplot(rValues,metric="MAE", scales=list(cex=1), col="Green")
>>>
>>>
>>>
>>>
>>> On Tue, Mar 3, 2020 at 5:07 PM Bert Gunter <bgunter.4567 using gmail.com>
>>> wrote:
>>>
>>>> 2 1/2 suggestions:
>>>>
>>>> 1. Provide a small reproducible example with **minimal code**. It can
>>>> be difficult to sort through dozens of lines of code, and I, anyway, would
>>>> be unwilling to spend time trying to debug/isolate the problem when you
>>>> have apparently not made much of an effort to do so yourself. Others may
>>>> well be both more knowledgeable and more tolerant, of course.
>>>>
>>>> 2. If, **after a suitable wait**, you have not received useful answers,
>>>> contact the package maintainer of the package you used **which you have
>>>> again failed to identify** (the caret package?). Also check to see whether
>>>> the package has its own user support structure. Some do, and this should be
>>>> the first point of contact anyway if so.
>>>>
>>>> 2 1/2. Post in **plain text** not html, though I don't think it
>>>> mattered here.
>>>>
>>>>
>>>> Bert Gunter
>>>>
>>>> "The trouble with having an open mind is that people keep coming along
>>>> and sticking things into it."
>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>
>>>>
>>>> On Tue, Mar 3, 2020 at 3:30 AM javed khan <javedbtk111 using gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, I am using different validation methods for random search and grid
>>>>> search. The validation methods are 10-fold CV, bootstrap, and LOOCV, but
>>>>> for LOOCV I get an error message when I draw boxplots of all the results.
>>>>>
>>>>> The error is: LOOCV is not compatible with `resamples()` since only one
>>>>> resampling estimate is available.
>>>>>
>>>>> The code is below.
>>>>>
>>>>> library(caret)
>>>>> library(farff)  # provides readARFF()
>>>>> d <- readARFF("china.arff")
>>>>> index <- createDataPartition(d$Effort, p = .70,list = FALSE)
>>>>> tr <- d[index, ]
>>>>> ts <- d[-index, ]
>>>>> index_2 <- createFolds(tr$Effort, returnTrain = TRUE, list = TRUE)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ct_rand <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
>>>>>                         index = index_2, search = "random")
>>>>> ct_grid <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
>>>>>                         index = index_2, search = "grid")
>>>>>
>>>>> ct_boot1 <- trainControl(method = "boot", number = 100, index = index_2,
>>>>>                          search = "random")
>>>>> ct_boot2 <- trainControl(method = "boot", number = 100, index = index_2,
>>>>>                          search = "grid")
>>>>>
>>>>> ct_locv <- trainControl(method = "LOOCV",  search="random")
>>>>> ct_locv2 <- trainControl(method = "LOOCV",   search="grid")
>>>>>
>>>>> set.seed(30218)
>>>>> ran_CV <- train(Effort ~ ., data = tr,
>>>>>                     method = "pls",
>>>>>                     tuneLength = 15,
>>>>>                     metric = "MAE",
>>>>>                     preProc = c("center", "scale", "zv"),
>>>>>                     trControl = ct_rand)
>>>>> getTrainPerf(ran_CV)
>>>>> rn <- predict(ran_CV, newdata = ts)
>>>>>
>>>>> ## ## ## ## ##grid search CV
>>>>>
>>>>> set.seed(30218)
>>>>> grid_CV <- train(Effort ~ ., data = tr,
>>>>>                      method = "pls",
>>>>>                      metric = "MAE",
>>>>>                      preProc = c("center", "scale", "zv"),
>>>>>                      trControl = ct_grid)
>>>>>
>>>>> getTrainPerf(grid_CV)
>>>>>
>>>>> set.seed(30218)
>>>>> ran_boot <- train(Effort ~ ., data = tr,
>>>>>                 method = "pls",
>>>>>                 tuneLength = 15,
>>>>>                 metric = "MAE",
>>>>>                 preProc = c("center", "scale", "zv"),
>>>>>                 trControl = ct_boot1)
>>>>> getTrainPerf(ran_boot)
>>>>> rn <- predict(ran_boot, newdata = ts)
>>>>> ##MAE(rn, ts$Effort)
>>>>>
>>>>>
>>>>> ## ## ## ## ##grid search boot
>>>>>
>>>>> set.seed(30218)
>>>>> grid_boot <- train(Effort ~ ., data = tr,
>>>>>                  method = "pls",
>>>>>                  metric = "MAE",
>>>>>                  preProc = c("center", "scale", "zv"),
>>>>>                  trControl = ct_boot2)
>>>>>
>>>>> getTrainPerf(grid_boot)
>>>>>
>>>>>
>>>>> set.seed(30218)
>>>>> ran_locv <- train(Effort ~ ., data = tr,
>>>>>                 method = "pls",
>>>>>                 tuneLength = 15,
>>>>>                 metric = "MAE",
>>>>>                 preProc = c("center", "scale", "zv"),
>>>>>                 trControl = ct_locv)
>>>>> getTrainPerf(ran_locv)
>>>>> rn <- predict(ran_locv, newdata = ts)
>>>>> ##MAE(rn, ts$Effort)
>>>>>
>>>>>
>>>>> ## ## ## ## ## grid search LOOCV
>>>>>
>>>>> set.seed(30218)
>>>>> grid_locv <- train(Effort ~ ., data = tr,
>>>>>                  method = "pls",
>>>>>                  metric = "MAE",
>>>>>                  preProc = c("center", "scale", "zv"),
>>>>>                  trControl = ct_locv2)
>>>>>
>>>>> getTrainPerf(grid_locv)
>>>>>
>>>>>
>>>>> rValues <- resamples(list(Random_Search_CV = ran_CV,
>>>>>                           Grid_Search_CV = grid_CV,
>>>>>                           Random_Search_Boot = ran_boot,
>>>>>                           Grid_Search_Boot = grid_boot,
>>>>>                           Random_Search_LOOCV = ran_locv,
>>>>>                           Grid_Search_LOOCV = grid_locv))
>>>>>
>>>>> bwplot(rValues,metric="MAE", scales=list(cex=1), col="Green")
>>>>>
>>>>> ______________________________________________
>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
