[R] Custom caret metric based on prob-predictions/rankings

Yang Zhang yanghatespam at gmail.com
Fri Feb 10 21:00:57 CET 2012


(I couldn't find answers to this question in the documentation)
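
A minimal sketch of the kind of probability-based metric discussed further down
the thread. It assumes that, with classProbs = TRUE, the data frame handed to
summaryFunction carries one probability column per class level (named after the
level) alongside obs and pred. The name aucSummary and the toy data are
hypothetical, and only base R is used so the logic can be checked without caret.

```r
# Hypothetical summary function: area under the ROC curve from class probabilities.
# Signature follows the (data, lev, model) convention of caret's defaultSummary.
aucSummary <- function(data, lev = NULL, model = NULL) {
  if (is.null(lev)) lev <- levels(data$obs)
  stopifnot(length(lev) == 2, lev[1] %in% names(data))
  pos   <- data$obs == lev[1]      # treat the first level as the event class
  score <- data[[lev[1]]]          # assumed column: predicted probability of that level
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  # Rank-sum (Mann-Whitney) form of the AUC; ties receive average ranks
  auc <- (sum(rank(score)[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
  c(AUC = auc)
}

# Toy hold-out data in the assumed layout
toy <- data.frame(
  obs  = factor(c("yes", "yes", "no", "no"), levels = c("yes", "no")),
  pred = factor(c("yes", "no", "yes", "no"), levels = c("yes", "no")),
  yes  = c(0.9, 0.4, 0.6, 0.1),
  no   = c(0.1, 0.6, 0.4, 0.9)
)
aucSummary(toy, lev = c("yes", "no"))
```

If the column layout matches, a function like this could be passed as
trainControl(classProbs = TRUE, summaryFunction = aucSummary), with
metric = "AUC" given to train so that tuning parameters are chosen on it.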

On Fri, Feb 10, 2012 at 11:59 AM, Yang Zhang <yanghatespam at gmail.com> wrote:
> Sorry for not being more clear - I'm interested in accessing these
> indices from within the trainControl summaryFunction, not afterward
> (from the train object).
>
> As for the weights, I'm referring to the weights argument passed into
> train.
>
> On Fri, Feb 10, 2012 at 5:50 AM, Max Kuhn <mxkuhn at gmail.com> wrote:
>> I think you need to read the man pages and the four vignettes. A lot
>> of your questions have answers there.
>>
>> If you don't specify the resampling indices, the ones generated for
>> you are saved in the train object:
>>
>> > data(iris)
>> > TrainData <- iris[,1:4]
>> > TrainClasses <- iris[,5]
>> >
>> > knnFit1 <- train(TrainData, TrainClasses,
>> +                  method = "knn",
>> +                  preProcess = c("center", "scale"),
>> +                  tuneLength = 10,
>> +                  trControl = trainControl(method = "cv"))
>> Loading required package: class
>>
>> Attaching package: ‘class’
>>
>> The following object(s) are masked from ‘package:reshape’:
>>
>>    condense
>>
>> Warning message:
>> executing %dopar% sequentially: no parallel backend registered
>> > str(knnFit1$control$index)
>> List of 10
>>  $ Fold01: int [1:135] 1 2 3 4 5 6 7 9 10 11 ...
>>  $ Fold02: int [1:135] 1 2 3 4 5 6 8 9 10 12 ...
>>  $ Fold03: int [1:135] 1 3 4 5 6 7 8 9 10 11 ...
>>  $ Fold04: int [1:135] 1 2 3 5 6 7 8 9 10 11 ...
>>  $ Fold05: int [1:135] 1 2 3 4 6 7 8 9 11 12 ...
>>  $ Fold06: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
>>  $ Fold07: int [1:135] 1 2 3 4 5 7 8 9 10 11 ...
>>  $ Fold08: int [1:135] 2 3 4 5 6 7 8 9 10 11 ...
>>  $ Fold09: int [1:135] 1 2 3 4 5 6 7 8 9 10 ...
>>  $ Fold10: int [1:135] 1 2 4 5 6 7 8 10 11 12 ...
>>
>> There is also a savePredictions argument that gives you the hold-out results.
>>
>> I'm not sure which weights you are referring to.
>>
>> On Fri, Feb 10, 2012 at 4:38 AM, Yang Zhang <yanghatespam at gmail.com> wrote:
>>> Actually, is there any way to get at additional information beyond the
>>> classProbs?  In particular, is there any way to find out the
>>> associated weights, or otherwise the row indices into the original
>>> model matrix corresponding to the tested instances?
>>>
>>> On Thu, Feb 9, 2012 at 4:37 PM, Yang Zhang <yanghatespam at gmail.com> wrote:
>>>> Oops, found trainControl's classProbs right after I sent!
>>>>
>>>> On Thu, Feb 9, 2012 at 4:30 PM, Yang Zhang <yanghatespam at gmail.com> wrote:
>>>>> I'm dealing with classification problems, and I'm trying to specify a
>>>>> custom scoring metric (recall at p, ROC, etc.) that depends on not just
>>>>> the class output but the probability estimates, so that caret::train
>>>>> can choose the optimal tuning parameters based on this metric.
>>>>>
>>>>> However, when I supply a trainControl summaryFunction, the data given
>>>>> to it contains only class predictions, so the only metrics possible
>>>>> are things like accuracy, kappa, etc.
>>>>>
>>>>> Is there any way to do this that I'm overlooking?  If not, could I put
>>>>> this in as a feature request?  Thanks!
>>>>>
>>>>> --
>>>>> Yang Zhang
>>>>> http://yz.mit.edu/
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>>
>> Max



-- 
Yang Zhang
http://yz.mit.edu/


