[R] Collinearity? Cannot get logisticRidge{ridge} to work

David Winsemius dwinsemius at comcast.net
Wed May 27 20:57:37 CEST 2015


On May 27, 2015, at 10:10 AM, Kengo Inagaki wrote:

> I am currently working on a health care related project using R. I am
> learning R while working on data analysis.
> 
> Below is the part of the data where I am encountering a problem.
> 
> 
> Case#    Sex     Therapy1    Therapy2    Outcome
> 1        male    no          no          Alive
> 

[Remaining data snipped: it arrived mangled because the message was sent as HTML.]

> 
> 
> "Outcome" is the response variable and "Sex", "Therapy1", "Therapy2" are
> predictor variables.
> 
> All of the predictors are significantly associated with the outcome by
> univariate analysis.
> 
> Logistic regression runs fine with most of the predictors when "Sex" and
> "Therapy1" are not included at the same time. (This is part of a table that
> I cut out from a larger table for ease of presentation; there are more
> predictors that I tested.)

Please examine the data before reaching for ridge regression:

What does this show?

    with(a,  table(Sex, Therapy1) )

I predict you will see a zero cell entry. Then read about "complete separation" and the so-called "Hauck-Donner effect".
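
As a minimal sketch of that check, assuming a small made-up data frame (the values below are hypothetical and are not the poster's data; only the object name "a" and the column names are taken from the original message), the cross-tabulations might look like this:

```r
# Hypothetical toy data, invented purely for illustration.
a <- data.frame(
  Sex      = factor(c("male", "male", "male", "female",
                      "female", "female", "male", "female")),
  Therapy1 = factor(c("no", "no", "yes", "yes",
                      "yes", "yes", "no", "yes")),
  Outcome  = factor(c("Alive", "Dead", "Alive", "Dead",
                      "Alive", "Dead", "Alive", "Alive"))
)

# Cross-tabulate the two predictors, as suggested above.
tab <- with(a, table(Sex, Therapy1))
print(tab)

# Flag any empty cell: a combination of levels that never occurs together.
has_zero_cell <- any(tab == 0)
print(has_zero_cell)

# A three-way view that also brings in the response.
print(ftable(xtabs(~ Sex + Therapy1 + Outcome, data = a)))
```

In this toy example no female case has Therapy1 == "no", so that cell is empty. With an empty cell like that, the two predictors are close to confounded in the sample, which would be consistent with the inflated standard errors described in the quoted text that follows.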

-- 
David.
> 
> However, when "Sex" and "Therapy1" are included in logistic regression
> model at the same time, standard error inflates and p value gets close to 1.
> 
> The formula used is:
> 
>> Model <- glm(Outcome ~ Sex + Therapy1, data = a, family = binomial)
> 
> (I assigned the table above to the object "a".)
> 
> 
> 
> After doing some reading, I suspect this might be collinearity, as the VIF
> values (from the "vif()" function in the car package) were sky high
> (8,875,841 for both "Sex" and "Therapy1").
> 
> Learning that ridge regression may be a solution, I tried
> logisticRidge {ridge} with the following call, but I get the
> accompanying error message.
> 
> 
> 
>> logisticRidge(a$Outcome~a$Sex+a$Therapy1)
> 
> Error in ifelse(y, log(p), log(1 - p)) :
>   invalid to change the storage mode of a factor
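
That message comes from base R's ifelse(), which tries to coerce its first argument to logical and refuses to do so for a factor. The sketch below reproduces the message outside of any modelling function and shows one possible 0/1 recoding of the response. The vectors y and p are hypothetical, and whether logisticRidge() then runs cleanly on the real data is an assumption that has not been checked here. Recoding would in any case not address the empty-cell issue raised above.

```r
# Hypothetical response and fitted probabilities, for illustration only.
y <- factor(c("Alive", "Dead", "Alive", "Alive"))
p <- c(0.8, 0.3, 0.6, 0.9)

# Using a factor as the test in ifelse() fails; capture the message.
err <- tryCatch(
  ifelse(y, log(p), log(1 - p)),
  error = function(e) conditionMessage(e)
)
print(err)

# Recode to 0/1. Treating "Dead" as 1 mirrors glm()'s convention that the
# first factor level ("Alive" here, alphabetically) is the "failure" level.
y01 <- as.integer(y == "Dead")

# The same expression now evaluates without error.
loglik <- sum(ifelse(y01, log(p), log(1 - p)))
print(loglik)
```

With the real data the analogous step would be something like a$Outcome01 <- as.integer(a$Outcome == "Dead"), where Outcome01 is a made-up column name.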
> 
> 
> 
> At this point I do not have an idea how to solve this and would like to
> seek help.
> 
> I really really appreciate your input!!!
> 
> 


David Winsemius
Alameda, CA, USA


