[R] Optimal Y>=q cutoff after logistic regression

David Winsemius dwinsemius at comcast.net
Mon Feb 14 06:45:47 CET 2011


On Feb 14, 2011, at 12:31 AM, Daniel Weitzenfeld wrote:

> Hi,
>
> I understand that dichotomization of the predicted probabilities after
> logistic regression is philosophically questionable, throwing out
> information, etc.
>
> But I want to do it anyway.  I'd like to include as a measure of fit %
> of observations correctly classified because it's measured in units
> that non-statisticians can understand more easily than area under the
> ROC curve, Dxy, etc.
>
> Am I right that there is an optimal Y>=q probability cutoff, at which
> the True Positive Rate is high and the False Positive Rate is low?

Only if the data supports it.

> Visually, it would be the elbow in the ROC curve, right?

If there is an "elbow", perhaps. The real answer is that you should
thoughtfully consider the consequences of a false negative (the test
wrongly says negative) and those of a false positive (the test wrongly
says positive), and then make a decision that properly balances both
the costs and the probabilities.
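
To make the cost-balancing idea concrete, here is a minimal base-R sketch. Everything in it is an assumption for illustration: the simulated data, the names `cost_fn`, `cost_fp` and `total_cost`, and the 5:1 cost ratio. It scans a grid of cutoffs and picks the one with the lowest total misclassification cost on the fitted data. For well-calibrated probabilities, the expected-cost rule also gives a closed-form threshold, cost_fp / (cost_fp + cost_fn), which is computed for comparison.

```r
# Simulated example; the data and the cost values are illustrative assumptions.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)               # predicted probabilities

cost_fn <- 5                   # assumed cost of one false negative
cost_fp <- 1                   # assumed cost of one false positive

# Total cost of classifying as positive whenever p >= q
total_cost <- function(q) {
  pred <- p >= q
  cost_fn * sum(!pred & y == 1) + cost_fp * sum(pred & y == 0)
}

cutoffs <- seq(0.01, 0.99, by = 0.01)
costs <- vapply(cutoffs, total_cost, numeric(1))

best_q <- cutoffs[which.min(costs)]             # empirical minimum-cost cutoff
theory_q <- cost_fp / (cost_fp + cost_fn)       # threshold if p is well calibrated
c(empirical = best_q, theoretical = theory_q)
```

With equal costs the theoretical threshold is 0.5; as false negatives become more expensive it moves toward 0. The empirical cutoff is chosen on the same data used to fit the model, so it will look somewhat better than it would on new data.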


> My reasoning is that even if you had a near-perfect model, you could
> set a stupidly low (high) cutoff and have a higher false positive
> (negative) rate than would be optimal.
>
> I know the standard default or starting point is Y>=.5,

Huh... what is Y?

> but if my
> above reasoning is correct, there ought to be an optimal cutoff for a
> given model.  Is there an easy way to determine that cutoff in R
> without writing my own script to iterate through possible breakpoints
> and calculating classification accuracy at each one?

There are packages that handle ROC analyses (ROCR and pROC on CRAN,
for example).
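
For a sense of what such a search involves, here is a minimal base-R sketch that does not use any ROC package. The simulated data and the names `roc_tab` and `youden` are assumptions for illustration. It evaluates every distinct fitted probability as a cutoff and reports the one that maximizes classification accuracy and the one that maximizes Youden's J (TPR minus FPR), which is one common way to formalize the "elbow".

```r
# Simulated example; replace x, y and fit with your own data and model.
set.seed(1)
n <- 400
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.3 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)
p <- fitted(fit)

# Candidate cutoffs: every distinct predicted probability
cutoffs <- sort(unique(p))

# One row per cutoff: true positive rate, false positive rate, accuracy
roc_tab <- as.data.frame(t(vapply(cutoffs, function(q) {
  pred <- p >= q
  c(cutoff   = q,
    tpr      = sum(pred & y == 1) / sum(y == 1),
    fpr      = sum(pred & y == 0) / sum(y == 0),
    accuracy = mean(pred == (y == 1)))
}, numeric(4))))
roc_tab$youden <- roc_tab$tpr - roc_tab$fpr

roc_tab[which.max(roc_tab$accuracy), ]   # cutoff with highest % correctly classified
roc_tab[which.max(roc_tab$youden), ]     # cutoff with highest TPR - FPR
```

The two criteria need not agree: accuracy depends on the proportion of positives in the data, while Youden's J weights sensitivity and specificity equally. Both are computed here on the fitting data and so are optimistic.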

>
> Thanks in advance.
> -Dan
>

David Winsemius, MD
West Hartford, CT


