[R] NAs error in caret function

javed khan <javedbtk111 using gmail.com>
Thu Apr 21 00:59:18 CEST 2022


Carlos Ortega, thank you for your answer.

The class label has three values (Bug, Code smell and Vulnerability). X is a
text-based feature that contains English statements, and we performed some
preprocessing on it, such as removing symbols and converting to lower case.

Yes, train_label is a factor.

I can provide the whole code and data if needed. We followed the same
method provided in this tutorial:

https://algotech.netlify.app/blog/text-lime/


cv.folds <- createMultiFolds(train$TYPE, k = 10, times = 3)

ctrl <- trainControl(method = "cv", number = 3, index = cv.folds,
                     classProbs = TRUE, summaryFunction = multiClassSummary)

m <- train(y = train_label, x = train_x,
           method = "knn",
           metric = "Accuracy",
           ## preProc = c("center", "scale", "nzv"),
           trControl = ctrl)

p <- predict(m, test_x)
confusionMatrix(p, as.factor(test_label))
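
One thing that may be worth checking, offered as an assumption rather than a diagnosis: with classProbs = TRUE, caret expects the class levels to be valid R variable names, and a level such as "Code smell" contains a space. A minimal base-R sketch of making the levels syntactically valid follows; the label vector is made up for illustration and is not the poster's data.

```r
# Hypothetical labels standing in for train_label (not the poster's data)
labels <- factor(c("Bug", "Code smell", "Vulnerability", "Bug"))

# make.names() turns each level into a syntactically valid R name,
# replacing characters such as spaces with "."
levels(labels) <- make.names(levels(labels))

print(levels(labels))
```

The same transformation would have to be applied to the test labels so that both factors share identical levels. Separately, createMultiFolds(k = 10, times = 3) returns the 30 resamples of a 10-fold CV repeated 3 times, so method = "repeatedcv" with number = 10 and repeats = 3 would probably describe the resampling more accurately than method = "cv" with number = 3.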

With some models, it shows an error like: Error in { :
  task 1 failed - "Not all variable names used in object found in newdata"
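
For text data, an error of this kind often means that the test predictors do not contain every column (term) the model was trained on. The sketch below shows one possible way to align a test set to the training columns in base R. The toy data frames and their term names are hypothetical, and whether this applies depends on how train_x and test_x were built.

```r
# Toy document-term data frames; the column names are hypothetical terms
toy_train <- data.frame(error = c(1, 0), leak = c(0, 2), pointer = c(1, 1))
toy_test  <- data.frame(leak = c(1, 0), crash = c(3, 0))

# Terms seen in training that are absent from the test set
missing_terms <- setdiff(names(toy_train), names(toy_test))

# Add the missing terms as all-zero columns, then keep only the training
# columns, in the training order (terms unseen in training are dropped)
toy_test[missing_terms] <- 0
test_aligned <- toy_test[, names(toy_train), drop = FALSE]

print(test_aligned)
```

Filling absent terms with zero treats them as "did not occur in this document", which is the usual reading for count-based text features.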

However, when I run a basic model such as naiveBayes, it works.

model_bayes <- naiveBayes(train_x, train_label, laplace = 1)
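
The warnings quoted further down report "replacement has 320292 rows, data has 1148" and "replacement has 320013 rows, data has 1147". Both ratios come out to the same whole number, which may hint that an object with that many columns is being flattened into one long vector somewhere during fitting. The sketch below reproduces the arithmetic and adds a few basic input checks. check_inputs is a hypothetical helper, not part of caret, and the toy objects are made up.

```r
# Numbers copied from the quoted warning messages
replacement_rows <- c(320292, 320013)
data_rows        <- c(1148, 1147)

# A whole-number ratio suggests a matrix-like object is being flattened
ratio <- replacement_rows / data_rows
print(ratio)

# Hypothetical helper (not a caret function): sanity checks on the inputs
check_inputs <- function(x, y) {
  c(x_is_data_frame = is.data.frame(x),
    y_is_factor     = is.factor(y),
    rows_match      = NROW(x) == length(y))
}

# Toy predictors and labels, for illustration only
toy_x <- matrix(0, nrow = 4, ncol = 3)
toy_y <- factor(c("a", "b", "a", "b"))

print(check_inputs(toy_x, toy_y))
print(check_inputs(as.data.frame(toy_x), toy_y))
```

Running the same checks on train_x and train_label would show whether train_x is a plain data frame with one row per label. If it is some other matrix-like class, converting it to a data frame before calling train() is one thing to try.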


On Wed, Apr 20, 2022 at 11:09 PM Carlos Ortega <coforfe using gmail.com> wrote:

> Hi,
>
> There are many things that could be wrong:
>
> 1. What is inside "ctrl" in the trainControl argument?
> 2. Your model is a classification one, but if you do not configure "ctrl"
> correctly, you do not get the metrics out correctly. It depends on whether
> your model is binary or multi-class.
> 3. Another thing is that if it is a classification one, you should also
> check that in "train()" your "train_label" is a factor.
>
> On top of that, remember that your problem is not reproducible.
> If you attach a portion of your data, we could create a working "caret"
> code.
>
> Thanks,
> Carlos Ortega.
>
> On Wed, Apr 20, 2022 at 10:26 PM Bert Gunter <bgunter.4567 using gmail.com>
> wrote:
>
>> A quick web search on 'R caret package' found a host of useful
>> results, the first of which was this:
>> https://topepo.github.io/caret/
>> Note that the author, Max Kuhn, explicitly says there that you can
>> email him with questions. I think you should do so, as you do not seem
>> to be making progress here.
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Wed, Apr 20, 2022 at 12:51 PM javed khan <javedbtk111 using gmail.com>
>> wrote:
>> >
>> > Caret produces the error: Something is wrong; all the Accuracy metric values
>> > are missing:
>> >     logLoss         AUC          prAUC        Accuracy       Kappa
>> >  Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA   Min.   : NA
>> >  1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA   1st Qu.: NA
>> >  Median : NA   Median : NA   Median : NA   Median : NA   Median : NA
>> >
>> > We (a group of three) are working on an assignment and have not been able
>> > to fix this error for a few days. The error occurs with the majority of
>> > the models, while with a few models (e.g. nb) the code works. The task is
>> > text-based classification.
>> >
>> > Some Warnings are:
>> >
>> > Warning messages:
>> > 1: In train.default(y = train_label, x = train_x, method = "pls",  ... :
>> >   The metric "ROC" was not in the result set. logLoss will be used instead.
>> > 2: model fit failed for Fold01.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320292 rows, data has 1148
>> >
>> > 3: model fit failed for Fold02.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320013 rows, data has 1147
>> >
>> > 4: model fit failed for Fold03.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320013 rows, data has 1147
>> >
>> > 5: model fit failed for Fold04.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320292 rows, data has 1148
>> >
>> > 6: model fit failed for Fold05.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320013 rows, data has 1147
>> >
>> > 7: model fit failed for Fold06.Rep1: ncomp=3 Error in
>> > `[[<-.data.frame`(`*tmp*`, i, value = structure(c(1L, 1L, 1L,  :
>> >   replacement has 320013 rows, data has 1147
>> >
>> >
>> >
>> > Code is
>> >
>> >
>> > m= train(y = train_label, x = train_x,
>> >       method = "pls" ,
>> >       metric = "Accuracy",
>> >       ## #  preProc = c("center", "scale", "nzv"),
>> >       trControl = ctrl)
>> >
>> > p=predict(m, test_x)
>> > confusionMatrix(p, as.factor(test_label))
>> >
>> >         [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>
