[R] Multicollinearity in logistic regression models

David Winsemius dwinsemius at comcast.net
Fri Dec 16 13:47:10 CET 2011


On Dec 15, 2011, at 7:41 PM, <cberry at tajo.ucsd.edu> wrote:

> David Winsemius <dwinsemius at comcast.net> writes:
>
>> On Dec 15, 2011, at 11:34 AM, Mohamed Lajnef wrote:
>>
>>> Dear All,
>>>
>>> Is there a method to diagnose multicollinearity in logistic regression
>>> models, like the VIF (variance inflation factor) indicator in linear
>>> regression?
>>>
>>
>> Wouldn't the matrix representation of the predictor "side" of the
>> regression be the same? Couldn't you just use the same methods you
>> employ for linear regression?
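By "the same methods" I mainly had in mind the classical VIFs, which only
need the predictor side of the model. A minimal sketch with simulated data
(x1, x2, x3 are invented for illustration, with x3 deliberately made nearly
collinear with x1):

## Design-matrix-only approach: classical VIF_j = 1 / (1 - R^2_j), where
## R^2_j comes from regressing predictor j on the remaining predictors.
## No outcome variable is involved at all.
set.seed(1)
n   <- 200
x1  <- rnorm(n)
x2  <- rnorm(n)
x3  <- x1 + 0.1 * rnorm(n)        # nearly collinear with x1
dat <- data.frame(x1, x2, x3)

sapply(names(dat), function(v) {
  r2 <- summary(lm(reformulate(setdiff(names(dat), v), v), data = dat))$r.squared
  1 / (1 - r2)
})
## x1 and x3 should show inflated values; x2 should stay near 1.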

Harrell's rms package has a vif function that is intended for use with  
fits from his logistic regression model function, lrm. It uses the  
variance-covariance matrix from the last iteration of the fitting  
process alluded to below and in Bert Gunter's reply.
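Something along these lines should work (a sketch only, assuming the rms
package is installed; the data and the binary outcome y are simulated
purely for illustration):

## VIFs from a fitted lrm object via rms::vif
library(rms)
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- rnorm(n)
x3 <- x1 + 0.1 * rnorm(n)                        # nearly collinear with x1
y  <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))  # made-up binary outcome
fit <- lrm(y ~ x1 + x2 + x3)
vif(fit)   # VIFs taken from the fit's information-based var-cov matrix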

>
> The trouble is that in logistic regression the Fisher information for each
> case has a factor of p[i]*(1-p[i]) (where 'p' is the vector of success
> probabilities and 'i' indexes the case).
>
> If the value of p[i] is very near one or zero, then the information
> provided is scant. And this will happen if you have a really good
> predictor in the mix.
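That factor is easy to see from a fitted binomial glm: the working weights
stored in the final IWLS iteration are fitted*(1 - fitted), so a strong
predictor drives many of them toward zero. A quick simulated check (all
values invented for illustration):

## Working weights in a binomial glm are p[i]*(1-p[i])
set.seed(2)
x   <- rnorm(100)
y   <- rbinom(100, 1, plogis(5 * x))       # deliberately strong predictor
fit <- glm(y ~ x, family = binomial)
p   <- fitted(fit)
all.equal(unname(fit$weights), unname(p * (1 - p)))   # TRUE
summary(p * (1 - p))   # many weights essentially zero -> scant information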

>
> Even with an orthogonal design, you can wind up with huge variances. And
> you can have an ill-conditioned var-cov matrix for the coefficients
> depending on how different cases get weighted. Thus, you could get the
> equivalent of multicollinearity even with an orthogonal design.
>
> And the diagnostics for linear regression would not be all
> that helpful if you have a good predictor.
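A small simulated illustration of the "huge variances" point (predictors and
coefficient values invented): the two columns below stay exactly orthogonal,
yet the coefficient standard errors from the logistic fit grow as the x1
effect strengthens and the weights p[i]*(1-p[i]) shrink.

## Orthogonal +/-1 design; only the strength of the x1 effect changes
set.seed(3)
n  <- 400
x1 <- rep(c(-1, 1), each  = n / 2)
x2 <- rep(c(-1, 1), times = n / 2)
crossprod(cbind(x1, x2))                 # off-diagonals are 0: orthogonal

for (b in c(0.5, 2, 3.5)) {
  y   <- rbinom(n, 1, plogis(b * x1 + 0.5 * x2))
  fit <- glm(y ~ x1 + x2, family = binomial)
  cat("b =", b, "  coef SEs:", round(sqrt(diag(vcov(fit))), 3), "\n")
}
## The SEs grow with b even though the design itself never changes.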

>
> OTOH, if the predictors were collectively pretty weak, the linear
> regression diagnostics might be OK.
>
> My advice: Google Scholar 'pregibon logistic regression', click where it
> says 'cited by ...' and page through the results to find good leads on
> this topic.

Yes, that was an interesting exercise. Brought back many fond  
memories. In my training we constructed  df-betas and df-deviance  
using GLIM macros. I think we may even have been given Pregibon's  
paper, since he was a local boy (UW). When one learns logistic  
regression on GLIM interacting with a VAX over a TTY line, R is a real  
treat.

-- 

David Winsemius, MD
West Hartford, CT


