[R] glm.fit: fitted probabilities numerically 0 or 1 occurred & glm.fit: algorithm did not converge

David Winsemius dwinsemius at comcast.net
Fri Aug 12 21:37:00 CEST 2016


> On Aug 12, 2016, at 11:32 AM, Shivi Bhatia <shivipmp82 at gmail.com> wrote:
> 
> Hi Michael,
> 
> While masking the data I missed some of the variables. Please find the
> updated file.
> 
> Also, here is the updated code (I have removed one of the variables as it
> had missing information):
> 
> glm.fit = glm(survey ~ support_cat + region + support_lvl + skill_group +
>          application_area + functional_area +
>          repS + case_age + case_status + severity_level +
>          sla_status, data = new, family = binomial)

I think you need to do some more data cleaning:

> with(new, table(survey, repS, severity_level) )
, , severity_level = 

      repS
survey   0   1
     0   0   0
     1   0   0

, , severity_level = high

      repS
survey   0   1
     0  52  18
     1   4 193

, , severity_level = medium

      repS
survey   0   1
     0  69  16
     1   7 367

, , severity_level = no

      repS
survey   0   1
     0   0   0
     1   0   1

, , severity_level = none

      repS
survey   0   1
     0  31  19
     1   4 183

-- 
David.


> Kindly assist with the same.
> 
> On Fri, Aug 12, 2016 at 11:05 PM, Michael Dewey <lists at dewey.myzen.co.uk>
> wrote:
> 
>> Your example code refers to a variable which is not in your dataset (repS)
>> so I get an error message. If I assume repS is in fact rep_score I get
>> another variable not found (delivery_segmentation).
>> 
>> I am afraid that I am unable to sort that one out so this is going to
>> remain a mystery. I endorse Bert's suggestion of getting local help.
>> 
>> On 12/08/2016 17:24, Shivi Bhatia wrote:
>> 
>>> Hi Bert,
>>> 
>>> Does this text file help? Apologies if it does not; I often have a
>>> hard time producing a reproducible example.
>>> 
>>> If this doesn't work, I can share a CSV with only 100 KB of data.
>>> 
>>> Regards, Shivi
>>> 
>>> On Fri, Aug 12, 2016 at 8:50 PM, Shivi Bhatia <shivipmp82 at gmail.com>
>>> wrote:
>>> 
>>>    Sure Bert, I will share the data after masking it. It isn't big.
>>> 
>>>    Regards, Shivi
>>> 
>>>    On Fri, Aug 12, 2016 at 8:36 PM, Bert Gunter <bgunter.4567 at gmail.com>
>>>    wrote:
>>> 
>>>        1. No, changing to factor will make no difference.
>>> 
>>>        2. I think that most likely your problem is your model is not
>>>        estimable/your design matrix is singular.  You should resolve this
>>>        by consulting with a local statistical expert or, if your data set
>>>        is not too large or confidential, posting your full dataset using
>>>        dput() (see ?dput for how to do this).
>>> 
>>>        Cheers,
>>>        Bert
>>>        Bert Gunter
>>> 
>>>        "The trouble with having an open mind is that people keep coming
>>>        along and sticking things into it."
>>>        -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>>> 
>>> 
>>>        On Fri, Aug 12, 2016 at 7:58 AM, Shivi Bhatia
>>>        <shivipmp82 at gmail.com> wrote:
>>>> Hi Michael,
>>>> 
>>>> There is no output as the model does not generate any coefficients and
>>>> simply throws this error.
>>>> 
>>>> I hope you are not asking for a reproducible example.
>>>> 
>>>> On Fri, Aug 12, 2016 at 7:30 PM, Michael Dewey
>>>> <lists at dewey.myzen.co.uk> wrote:
>>>> 
>>>> 
>>>>> Dear Shivi
>>>>> 
>>>>> Can you show us the output?
>>>>> 
>>>>> And please do not post in HTML as it will mangle your post into
>>>>> unreadability.
>>>>> 
>>>>> On 12/08/2016 10:10, Shivi Bhatia wrote:
>>>>> 
>>>>>> Hi Team,
>>>>>> 
>>>>>> I am creating *my first* logistic regression in RStudio. I am working on
>>>>>> C-SAT data where a rating (score) of 0-8 is a dis-sat whereas 9-10 is a
>>>>>> SAT. As these were in numeric form, I created 2 classes as below:
>>>>>> 
>>>>>> new$survey[new$score>=0 & new$score<=8]<- 0
>>>>>> new$survey[new$score>=9]<- 1
>>>>>> This works fine; however, the class still shows as "numeric" and levels
>>>>>> shows as "NULL". Do I still need to use "as.factor" to let R know these
>>>>>> are categorical variables?
>>>>>> 
>>>>>> Also, I have used the code below to run a logistic regression with all
>>>>>> the possible predictor variables:
>>>>>> glm.fit= glm(survey ~ support_cat + region+ support_lvl+ skill_group+
>>>>>>          application_area+ functional_area+
>>>>>>          repS+ case_age+ case_status+ severity_level+
>>>>>>          sla_status+ delivery_segmentation, data = SFDC,
>>>>>>          family = binomial)
>>>>>> 
>>>>>> But it throws an error:-
>>>>>> Warning messages:
>>>>>> 1: glm.fit: algorithm did not converge
>>>>>> 2: glm.fit: fitted probabilities numerically 0 or 1 occurred
>>>>>> 
>>>>>> I checked online for the error and it says:
>>>>>> "glm() uses an iterative re-weighted least squares algorithm. The
>>>>>> algorithm hit the maximum number of allowed iterations before signalling
>>>>>> convergence. The default, documented in ?glm.control is 25."
>>>>>> 
>>>>>> Kindly advise on the above case and whether I have to convert my outcome
>>>>>> variable with as.factor.
>>>>>> 
>>>>>> Thank you, Shivi
>>>>>> 
>>>>>>        [[alternative HTML version deleted]]
>>>>>> 
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>> 
>>>>>> 
>>>>> --
>>>>> Michael
>>>>> http://www.dewey.myzen.co.uk/home.html
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>> --
>> Michael
>> http://www.dewey.myzen.co.uk/home.html
>> 
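
On the second warning in the original post: fitted probabilities of 0 or 1 often go together with factor levels in which only one outcome value occurs, such as the single "no" record in the table above. A minimal sketch of a screen for such levels follows. `one_sided_levels` is a hypothetical helper written for this note, not part of base R, and the data are invented.

```r
# Hypothetical helper: list the levels of a categorical predictor
# in which the 0/1 outcome takes only one value.
one_sided_levels <- function(outcome, predictor) {
  tab <- table(predictor, factor(outcome, levels = c(0, 1)))
  tab <- tab[rowSums(tab) > 0, , drop = FALSE]  # ignore empty levels
  rownames(tab)[tab[, "0"] == 0 | tab[, "1"] == 0]
}

# Invented example data.
dat <- data.frame(
  survey = c(0, 1, 0, 1, 1, 1),
  severity_level = c("high", "high", "medium", "medium", "no", "none")
)

one_sided_levels(dat$survey, dat$severity_level)
```

Levels flagged this way are candidates for merging or removal before refitting. Raising maxit in glm.control() is unlikely to help on its own, because the estimate for such a level has no finite value to converge to.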

David Winsemius
Alameda, CA, USA


