[R] logistic regression model + Cross-Validation

Frank E Harrell Jr f.harrell at vanderbilt.edu
Mon Jan 22 00:17:32 CET 2007


nitin jindal wrote:
> If validate.lrm does not have this option, does any other function have it?
> I will certainly look into your advice on cross-validation. Thanks.
> 
> nitin

Not that I know of, but easy to program.
Frank
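
As a rough illustration of what "easy to program" could look like, here is a
minimal sketch that collects the out-of-fold predicted probability for every
record in a single k-fold cross-validation. It is not code from this thread.
It uses base R's glm() as a stand-in for lrm() so that it runs without extra
packages, the data are simulated, and the names d, k, fold and p_cv are
assumptions made for the example.

```r
# Sketch: out-of-fold predicted probabilities from one k-fold cross-validation.
# Simulated data; d, k, fold and p_cv are illustrative names only.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$cy <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.8 * d$x2))

k    <- 10
fold <- sample(rep(seq_len(k), length.out = n))  # random fold label per record
p_cv <- rep(NA_real_, n)                         # one prediction per record

for (j in seq_len(k)) {
  held_out <- fold == j
  # Refit on the other k - 1 folds, then predict the held-out records
  fit <- glm(cy ~ x1 + x2, family = binomial, data = d[!held_out, ])
  p_cv[held_out] <- predict(fit, newdata = d[held_out, ], type = "response")
}

# Each record's observed class next to its cross-validated probability
head(cbind(d, fold = fold, p_cv = p_cv))
```

Each record is predicted exactly once, by a model that never saw it, so p_cv
can be inspected against cy record by record. A single pass like this is still
subject to the caveat about single k-fold runs quoted below.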

> 
> On 1/21/07, Frank E Harrell Jr <f.harrell at vanderbilt.edu> wrote:
>> nitin jindal wrote:
>>> Hi,
>>>
>>> I am trying to cross-validate a logistic regression model.
>>> I am using the logistic regression function (lrm) from the Design package.
>>>
>>> f <- lrm( cy ~ x1 + x2, x=TRUE, y=TRUE)
>>> val <- validate.lrm(f, method="cross", B=5)
>> val <- validate(f, ...)    # .lrm not needed
>>
>>> My class cy has values 0 and 1.
>>>
>>> The "val" variable gives me indices like slope and AUC, but I also need
>>> the vector of predicted values of the class variable "cy" for each record
>>> during cross-validation, so that I can look at the results manually. Is
>>> there any way to get the probabilities assigned to each class?
>>>
>>> regards,
>>> Nitin
>> No, validate.lrm does not have that option.  Manually looking at the
>> results will not be easy when you do enough cross-validations.  A single
>> 5-fold cross-validation does not provide accurate estimates.  Either use
>> the bootstrap or repeat k-fold cross-validation between 20 and 50 times.
>>   k is often 10 but the optimum value may not be 10.  Code for averaging
>> repeated cross-validations is in
>> http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RmS/logistic.val.pdf
>> along with simulations of bootstrap vs. a few cross-validation methods
>> for binary logistic models.
>>
>> Frank
>> --
>> Frank E Harrell Jr   Professor and Chair           School of Medicine
>>                       Department of Biostatistics   Vanderbilt University
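
The advice above to repeat k-fold cross-validation 20 to 50 times can be
sketched as follows. This is not the code from the linked PDF. It is a minimal
illustration under stated assumptions: glm() stands in for lrm(), the data are
simulated, the Brier score is chosen arbitrarily as the accuracy index, and
cv_brier and reps are hypothetical names.

```r
# Sketch: average an accuracy index over repeated 10-fold cross-validations.
set.seed(2)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$cy <- rbinom(n, 1, plogis(0.5 * d$x1 - 0.8 * d$x2))

# One k-fold pass: Brier score of the out-of-fold predicted probabilities
cv_brier <- function(data, k = 10) {
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  p <- numeric(nrow(data))
  for (j in seq_len(k)) {
    out <- fold == j
    fit <- glm(cy ~ x1 + x2, family = binomial, data = data[!out, ])
    p[out] <- predict(fit, newdata = data[out, ], type = "response")
  }
  mean((p - data$cy)^2)
}

reps  <- 20                            # number of repetitions, new folds each time
brier <- replicate(reps, cv_brier(d))

mean(brier)  # averaged cross-validated Brier score
sd(brier)    # spread across repetitions, i.e. the noise in a single run
```

The spread across repetitions gives a feel for how much a single k-fold run
can vary purely because of how the folds happened to be drawn.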


