[R] Question about ROCR package

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sun Feb 8 16:27:58 CET 2009


Tobias Sing wrote:
> Waverley,
> 
> you can also use perf@y.values to access the slot (see
> help("performance-class") for a description of the slots).
> 
> You might also want to have a look at the code for demo(ROCR) and at this
> slide deck:
> http://rocr.bioinf.mpi-sb.mpg.de/ROCR_Talk_Tobias_Sing.ppt
> 
> HTH,
>   Tobias

Tobias,

In my view there is one significant omission from your handout: high
resolution calibration curves.  There is a need to show that predictive
models predict accurately.  See for example the val.prob function in the
Design package.  The many graphs related to cumulative probabilities
are nice, but in some ways they get in the way of the fundamental
elements of absolute accuracy (calibration curves) and predictive
discrimination (a simple histogram of predicted probabilities ignoring Y).
I go into this in my 1996 Stat in Med paper.  In my view the
continuous accuracy measures need to be examined first, because
dichotomizations provide only crude approximations to be plugged into
decision making.  Dichotomizations (classifiers) may provide good
decisions for a group of subjects but not so good decisions for every
individual member of the group.  For one thing, different group members
have different loss/utility functions.  For another, a predicted
probability of 0.5 may often best be summarized as "collect another
predictor variable for this subject."
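
A minimal sketch of the two displays described above, using simulated
data (the object names x, y, and p are made up for illustration) and
assuming the Design package is installed:

library(Design)   # provides val.prob (Design was later superseded by rms)

set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(x))   # simulated binary outcome
p <- plogis(0.9 * x)             # predicted probabilities from some model

# High-resolution calibration curve: predicted vs. observed probability
val.prob(p, y)

# Predictive discrimination: histogram of predicted probabilities, ignoring Y
hist(p, nclass = 25, xlab = "Predicted probability", main = "")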

Related to this is that ROC-type measures result in a decision rule for 
one subject that is a function of all the data of all the subjects in 
the sample.  This violates a basic principle of optimum Bayes decisions. 
A related reference is below.

Just my $.02.

Frank

@Article{bri08ski,
  author  = {Briggs, William M. and Zaretzki, Russell},
  title   = {The skill plot: {A} graphical technique for evaluating
             continuous diagnostic tests (with discussion)},
  journal = {Biometrics},
  year    = 2008,
  volume  = 63,
  pages   = {250--261},
  annote  = {ROC curve; sensitivity; skill plot; skill score; specificity;
             diagnostic accuracy; diagnosis; ``statistics such as the AUC
             are not especially relevant to someone who must make a decision
             about a particular $x_{c}$.  \ldots ROC curves lack or obscure
             several quantities that are necessary for evaluating the
             operational effectiveness of diagnostic tests. \ldots ROC
             curves were first used to check how radio \emph{receivers}
             (like radar receivers) operated over a range of frequencies.
             \ldots This is not how most ROC curves are used now,
             particularly in medicine.  The receiver of a diagnostic
             measurement \ldots wants to make a decision based on some
             $x_{c}$, and is not especially interested in how well he would
             have done had he used some different cutoff.''; in the
             discussion David Hand states ``when integrating to yield the
             overall AUC measure, it is necessary to decide what weight to
             give each value in the integration.  The AUC implicitly does
             this using a weighting derived empirically from the data.
             This is nonsensical.  The relative importance of
             misclassifying a case as a noncase, compared to the reverse,
             cannot come from the data itself.  It must come externally,
             from considerations of the severity one attaches to the
             different kinds of misclassifications.''}
}

> 
> On Sat, Feb 7, 2009 at 10:40 PM, Jorge Ivan Velez
> <jorgeivanvelez at gmail.com> wrote:
>> Hi Waverley,
>> I forgot to tell you that "perf" is your performance object. Here is an
>> example from the ROCR package:
>> ## computing a simple ROC curve (x-axis: fpr, y-axis: tpr)
>> library(ROCR)
>> data(ROCR.simple)
>> pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)
>> perf <- performance(pred, "tpr", "fpr")
>>
>> # y.values
>> unlist(slot(perf, "y.values"))
>>
>> HTH,
>>
>> Jorge
>>
>>
>>
>>> On Sat, Feb 7, 2009 at 3:17 PM, Waverley <waverley.paloalto at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have a question about the ROCR package.  I got the ROC curve plotted
>>>> without any problem following the manual.  However, I don't know how to
>>>> extract the values, e.g. y.values (I think it is the area under the
>>>> curve, AUC, measure).  The return is an object of class "performance",
>>>> which has slots, and one of the slots is "y.values".  If I type the object
>>>> I can see the values on screen, but I want to extract them for
>>>> further programming and computation.  I did a summary of the object
>>>> and it is an "S4" object, which I don't understand.
>>>>
>>>> Can someone help?
>>>>
>>>> Thanks a lot in advance.
>>>>
>>>> --
>>>> Waverley @ Palo Alto
>>>>
-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
