[R] Calculate Specificity and Sensitivity for a given threshold value

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Nov 13 20:17:47 CET 2008


Pierre-Jean-EXT.Breton at sanofi-aventis.com wrote:
> Hi Frank,
> 
> Thank you for your answer. 
> In fact, I don't use this for clinical research practice.
> I am currently testing several scoring methods and I'd like
> to know which one is the most effective and which threshold
> value I should apply to discriminate positives and negatives.
> So, do you have any ideas for my problem?

The use of thresholds gets in the way of finding a good solution because 
you will have predictor values in the "gray zone".  I tend to rank 
methods by the most sensitive index available such as the log likelihood 
in the binary logistic model.  You can extend ordinary logistic models 
to allow for nonlinear effects on the log odds scale using regression 
splines.
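
As a minimal sketch of the difference (not code from this thread), the
lines below take the 11 rows quoted further down, compute sensitivity
and specificity at a cutoff of 0.5 for reference, and then compute the
log likelihood of a binary logistic model that keeps the score
continuous. The helper names sens_spec and score_loglik are
hypothetical, and treating "score >= cutoff" as a positive call is an
assumption about how the score is oriented.

```r
# Data copied from the quoted message (DIAGNOSIS: 1 = positive, 0 = negative).
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)

# Threshold-based summary, shown for reference only.
# Assumption: a score at or above the cutoff is called positive.
sens_spec <- function(score, truth, cutoff) {
  called_pos <- score >= cutoff
  c(sensitivity = mean(called_pos[truth == 1]),   # true positive rate
    specificity = mean(!called_pos[truth == 0]))  # true negative rate
}
ss <- sens_spec(d$SCORE, d$DIAGNOSIS, cutoff = 0.5)
print(ss)

# Threshold-free summary: log likelihood of a binary logistic model
# with the score entered as a continuous (linear) predictor.
score_loglik <- function(score, truth) {
  fit <- glm(truth ~ score, family = binomial)
  as.numeric(logLik(fit))
}
ll <- score_loglik(d$SCORE, d$DIAGNOSIS)
print(ll)
```

On these rows the 0.5 cutoff gives sensitivity 5/9 and specificity 1/2.
To rank several scoring methods, one could call score_loglik once per
method on the same cases and prefer the larger log likelihood. With a
much larger sample, replacing score in the formula with something like
splines::ns(score, df = 3) (df = 3 is only an illustrative choice)
would allow a nonlinear effect on the log odds scale; models with
different numbers of parameters would then need a penalized comparison
such as AIC.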

Frank

> 
> Pierre-Jean
> 
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:f.harrell at vanderbilt.edu] 
> Sent: Thursday, November 13, 2008 5:00 PM
> To: Breton, Pierre-Jean-EXT R&D/FR
> Cc: r-help at r-project.org
> Subject: Re: [R] Calculate Specificity and Sensitivity for a given threshold value
> 
> Kaliss wrote:
>> Hi list,
>>
>>
>> I'm new to R and I'm currently using ROCR package.
>> My input data look like this:
>>
>> DIAGNOSIS	SCORE
>> 1	0.387945
>> 1	0.50405
>> 1	0.435667
>> 1	0.358057
>> 1	0.583512
>> 1	0.387945
>> 1	0.531795
>> 1	0.527148
>> 0	0.526397
>> 0	0.372935
>> 1	0.861097
>>
>> And I run the following simple code:
>> library(ROCR)
>> d <- read.table("inputFile", header=TRUE)
>> pred <- prediction(d$SCORE, d$DIAGNOSIS)
>> perf <- performance(pred, "tpr", "fpr")
>> plot(perf)
>>
>> So building the curve works easily.
>> My question is: can I have the specificity and the sensitivity for a 
>> score threshold = 0.5 (for example)? How do I compute this?
>>
>> Thank you in advance
> 
> Beware of the utility/loss function you are implicitly assuming with
> this approach.  It is quite oversimplified.  In clinical practice the
> cost of a false positive or false negative (which comes from a cost
> function and the simple forward probability of a positive diagnosis,
> e.g., from a basic logistic regression model if you start with a cohort
> study) varies with the type of patient being diagnosed.
> 
> Frank
> 


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


