[R] ROC optimal threshold

Michael Kubovy kubovy at virginia.edu
Fri Mar 31 15:54:12 CEST 2006


Hi Tim and José,

>> Date: Fri, 31 Mar 2006 11:58:14 +0200
>> From: "Anadon Herrera, Jose Daniel" <jdanadon at umh.es>
>> Subject: [R] ROC optimal threshold
>>
>> I am using the ROC package to evaluate predictive models.
>> I have successfully plotted the ROC curve; however,
>>
>> is there any way to obtain the value of the operating point (the
>> optimal threshold value, i.e. the point of the curve nearest to the
>> top-left corner of the axes)?

On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:

> I've struggled a bit with the same question, said another way: "how  
> do you find the value in a ROC curve that minimizes false positives  
> while maximizing true positives"?
>
> Here's something I've come up with. I'd be curious to hear from the  
> list whether anyone thinks this code might get stuck in local  
> minima, or if it does find the global minimum each time. (I think  
> it's ok).
>
> From your ROC object you need to grab the sensitivity (= true
> positive rate), the specificity (= 1 - false positive rate), and the
> cutoff levels. Then find the value that minimizes
> abs(sensitivity - specificity), or
> sqrt((1 - sens)^2 + (1 - spec)^2), as follows:
>
> absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]
> sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 +
>                                   (1 - extract$spec)^2)), ]
>
> In this example, 'extract' is a dataframe containing three columns:  
> extract$sens = sensitivity values, extract$spec = specificity  
> values, extract$votes = cutoff values. The command subsets the  
> dataframe to a single row containing the desired cutoff and the  
> sens and spec values that are associated with it.
>
> Most of the time these two answers (abs or sqrt) are the same, but
> sometimes they differ quite a bit.
>
> I do not see this application of ROC curves very often. A question  
> for those much more knowledgeable than I.... is there a problem  
> with using ROC curves in this manner?
>
> Tim Howard
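
For anyone who wants to try Tim's two criteria, here is a minimal
sketch, not a definitive implementation. The simulated `truth` and
`score` vectors are made up for illustration; only the column names
(`sens`, `spec`, `votes`) come from Tim's description of `extract`.

```r
# Toy data (an assumption for illustration): 50 noise and 50 signal cases.
set.seed(1)
truth <- rep(c(0, 1), each = 50)
score <- rnorm(100, mean = truth)   # signal scores are shifted upward

# Build the 'extract' data frame: one row per candidate cutoff.
cutoffs <- sort(unique(score))
extract <- data.frame(
  votes = cutoffs,
  sens  = sapply(cutoffs, function(k) mean(score[truth == 1] >= k)),
  spec  = sapply(cutoffs, function(k) mean(score[truth == 0] <  k))
)

# Criterion 1: cutoff where sensitivity and specificity are closest.
absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]

# Criterion 2: cutoff closest to the top-left corner (sens = 1, spec = 1).
sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 +
                                  (1 - extract$spec)^2)), ]
```

On the local-minimum worry: which.min() scans every row, so it returns
the global minimum over the tabulated cutoffs (the first one if there
are ties). It is not an iterative search that could get stuck.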

@BOOK{MacmillanCreelman2005,
   title = {Detection theory: {A} user's guide},
   publisher = {Lawrence Erlbaum Associates},
   year = {2005},
   address = {Mahwah, NJ, USA},
   edition = {2nd},
   author = {Macmillan, Neil A and Creelman, C Douglas},
}
on p. 43 shows that the ideal value of the cutoff depends on the  
reward function R that specifies the payoff for each outcome:
\[
LR(x) = \beta =
  \frac{R(\mathrm{true\ negative}) - R(\mathrm{false\ positive})}
       {R(\mathrm{true\ positive}) - R(\mathrm{false\ negative})}
  \cdot \frac{p(\mathrm{noise})}{p(\mathrm{signal})}
\]

I believe that your attempt to minimize false positives while
maximizing true positives amounts to maximizing the proportion of
correct answers. In that case the payoff ratio is 1, so
$\beta = p(noise)/p(signal)$; if signal and noise are equally likely,
$\beta = 1$ (i.e. $\ln \beta = 0$). Otherwise it might be best to
explicitly state your costs and benefits by specifying the reward
function R.
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/



