[R] Changing the classification threshold for cost function

SiBorg simon.dulku at doctors.net.uk
Thu Aug 2 21:18:33 CEST 2012


Dear All

I am trying to perform leave-one-out cross validation on a logistic
regression model using cv.glm from the boot package in R.

As I understand it, the standard cost function

cost <- function(r, pi = 0) mean(abs(r - pi) > 0.5)

uses a 50% risk threshold to classify cases as positive or negative and
calculates the prediction error (the proportion misclassified) from this.
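
To make that reading concrete, here is a small check that the cost function
quoted above gives the same answer as explicitly classifying at 0.5 and
counting the misclassified cases (apart from ties at exactly 0.5). The
outcome and probability vectors are made-up illustration values, not data
from my model.

```r
# Cost function as quoted above
cost <- function(r, pi = 0) mean(abs(r - pi) > 0.5)

# Hypothetical observed outcomes (0/1) and predicted probabilities
r  <- c(0, 0, 1, 1, 0)
pi <- c(0.10, 0.60, 0.40, 0.90, 0.20)

# Error rate from the quoted cost function
standard_error <- cost(r, pi)

# Explicit form: call a case positive when its predicted risk exceeds 0.5,
# then take the proportion of cases where the call disagrees with the outcome
explicit_error <- mean((pi > 0.5) != r)

print(c(standard = standard_error, explicit = explicit_error))
```

Both numbers agree here, which suggests the 0.5 in the cost function is
acting as the classification cut-off.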

I would like to lower this threshold to, say, 5% or 1%, and calculate the
prediction error at that level.  This is because in my model none of the
patients are at more than 50% risk of developing the condition before they
are screened, so a 50% cut-off classifies every patient as negative and
gives no useful information about the model.  The model is more useful for
removing the very low risk patients from the number needed to screen.

Can anybody tell me how to rewrite the cost function so that the
classification error is calculated using a 5% or 1% cut-off instead of the
50% level?  Furthermore, in this scenario false positives
(i.e. patients who are predicted to be at higher risk but then don't develop
the condition) are more acceptable than false negatives (patients who are
predicted at low risk but then get the condition).  This is because any
patient at low risk doesn't get screened and therefore doesn't get treated.

Is there any way of penalising the model only on false negatives rather than
false positives at the lower cut-off value?
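
A minimal sketch of the kind of cost function being asked about, under
stated assumptions: the response is coded 0/1, the second argument holds
predicted probabilities, and a case counts as predicted positive when its
probability exceeds the threshold. The helper make_cost and its arguments
threshold, fn_weight and fp_weight are hypothetical names introduced for
illustration; they are not part of the boot package. Setting fp_weight = 0
penalises false negatives only.

```r
# Hypothetical helper: returns a cost function for a chosen risk threshold.
# fn_weight / fp_weight are the penalties per false negative / false positive.
make_cost <- function(threshold = 0.05, fn_weight = 1, fp_weight = 1) {
  function(r, pi = 0) {
    predicted <- as.numeric(pi > threshold)   # 1 = predicted high risk
    false_neg <- r == 1 & predicted == 0      # missed cases
    false_pos <- r == 0 & predicted == 1      # screened unnecessarily
    # Weighted errors as a proportion of all cases
    mean(fn_weight * false_neg + fp_weight * false_pos)
  }
}

cost_5       <- make_cost(threshold = 0.05)                 # both error types
cost_fn_only <- make_cost(threshold = 0.05, fp_weight = 0)  # false negatives only

# Made-up outcomes and predicted risks, for illustration only
r  <- c(0, 0, 1, 1, 0, 1)
pi <- c(0.01, 0.08, 0.03, 0.20, 0.04, 0.10)

err_5       <- cost_5(r, pi)        # one false positive + one false negative
err_fn_only <- cost_fn_only(r, pi)  # the single false negative only
print(c(both = err_5, fn_only = err_fn_only))
```

Either function could then be passed as the cost argument, for example
cv.glm(mydata, fit, cost = cost_fn_only)$delta[1], where mydata and fit are
placeholder names for the data frame and the fitted glm. The error is
expressed as a proportion of all cases rather than of true positives so that
it still makes sense when each left-out fold holds a single observation.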

Thanks in advance for your help.





