[R] Logistic Regression - Interpreting SENS (Sensitivity) and SPEC (Specificity)

Prof Brian Ripley ripley at stats.ox.ac.uk
Mon Oct 13 09:52:30 CEST 2008

On Mon, 13 Oct 2008, Peter Dalgaard wrote:

> Dieter Menne wrote:
>> Maithili Shiva <maithili_shiva <at> yahoo.com> writes:
>>> I have a main sample of 42500 clients and, based on their status as
>>> defaulted / non-defaulted, I have generated the probability of default.
>>> I have a hold-out sample of 5000 clients. I have calculated (1) the
>>> number of correctly classified goods (Gg), (2) the number of correctly
>>> classified bads (Bb), (3) the number of wrongly classified bads (Gb)
>>> and (4) the number of wrongly classified goods (Bg).
>> The simple and wrong answer is to use these data directly to compute
>> sensitivity (fraction of hits). This measure is useless, but I encounter
>> it often in medical publications.
>> You can get a more reasonable answer by using cross-validation. Check,
>> for example, Frank Harrell's
>> http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
> But if he has a "hold out sample", isn't he already cross-validating??  I 
> wonder if you're answering the right question there. Could he just be looking 
> for Sp=Gg/(Gg+Bg), Se=Bb/(Gb+Bb)? (If I got the notation right.)

Strictly no: she is 'validating' (no cross- involved).  Cross-validation
would be a better idea for much smaller sample sizes (we don't know how
many regressors are involved, so say sample sizes in the hundreds, unless
there are more than ten regressors).
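
For illustration only, here is a minimal sketch of plain hold-out
validation with the Sp and Se formulas quoted above.  Everything in it is
an assumption rather than the poster's setup: the data are simulated, there
is a single made-up regressor x, 'bad' (defaulted) is treated as the
positive class, and the 0.5 cutoff is arbitrary.  In the count names the
capital letter is the predicted class and the small letter the true class.

```r
set.seed(1)
n_train <- 42500                      # main sample size from the thread
n_hold  <- 5000                       # hold-out sample size from the thread
n <- n_train + n_hold

# Simulated stand-in data: x and the coefficients are hypothetical
x   <- rnorm(n)
bad <- rbinom(n, 1, plogis(-2 + 1.5 * x))   # 1 = defaulted ("bad")
dat <- data.frame(x = x, bad = bad)

train <- dat[seq_len(n_train), ]
hold  <- dat[-seq_len(n_train), ]

# Fit on the main sample only, then predict on the untouched hold-out
fit    <- glm(bad ~ x, family = binomial, data = train)
p_hold <- predict(fit, newdata = hold, type = "response")

cutoff   <- 0.5                       # assumed classification threshold
pred_bad <- as.integer(p_hold > cutoff)

# Counts: capital = predicted class, small = true class
Gg <- sum(pred_bad == 0 & hold$bad == 0)   # goods correctly classified
Bg <- sum(pred_bad == 1 & hold$bad == 0)   # goods wrongly classified
Gb <- sum(pred_bad == 0 & hold$bad == 1)   # bads wrongly classified
Bb <- sum(pred_bad == 1 & hold$bad == 1)   # bads correctly classified

Sp <- Gg / (Gg + Bg)                  # specificity: fraction of true goods kept
Se <- Bb / (Gb + Bb)                  # sensitivity: fraction of true bads caught

print(c(Sp = Sp, Se = Se))
```

Both figures depend on the chosen cutoff, so they describe one operating
point of the fitted model and not the model as a whole.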

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
