[R] [FORGED] Dependent Variable in Logistic Regression

Rolf Turner
Sun Aug 2 02:46:40 CEST 2020


On 2/08/20 5:39 am, Paul Bernal wrote:

> Dear friends,
> 
> Hope you are doing great. I want to fit a logistic regression in R, where
> the dependent variable is the covid status (I used 1 for covid positives,
> and 0 for covid negatives), but when I ran the glm, R complains that I
> should make the dependent variable a factor.
> 
> What would be more advisable, to keep the dependent variable with 1s and
> 0s, or code it as yes/no and then make it a factor?
> 
> Any guidance will be greatly appreciated,


There have been many responses to this post, the majority of them being 
confusing and off the point.

BOTTOM LINE:  R/glm() does *NOT* complain that one "should make the 
dependent variable a factor".   This is bovine faecal output.

As Rui Barradas has pointed out (alternatively: RTFM!) when you fit a 
Bernoulli model using glm(), your response/dependent variable is allowed 
to be

     * a numeric variable with values 0 or 1
     * a logical variable
     * a factor with two levels
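
As a minimal sketch of the three accepted forms (the data are simulated, and the names x, y_num, y_lgl and y_fac are made up purely for illustration), the same response can be coded each way and the fits compared:

```r
# Simulated data; every name here is hypothetical, not from the OP's data.
set.seed(42)
x     <- rnorm(100)
y_num <- rbinom(100, size = 1, prob = plogis(0.5 + x))  # numeric 0/1
y_lgl <- y_num == 1                                      # logical
y_fac <- factor(y_num, levels = c(0, 1),
                labels = c("negative", "positive"))      # two-level factor

fit_num <- glm(y_num ~ x, family = binomial)
fit_lgl <- glm(y_lgl ~ x, family = binomial)
fit_fac <- glm(y_fac ~ x, family = binomial)

# For a factor response the first level is treated as "failure",
# so all three fits model P(positive) and should agree.
all.equal(coef(fit_num), coef(fit_lgl))
all.equal(coef(fit_num), coef(fit_fac))
```

With a factor response it is the order of the levels that decides which outcome is modelled, so it seems safest to set the levels explicitly rather than rely on the default alphabetical ordering.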

The OP presumably fed glm() a *character* vector with values "0" and 
"1".  Doing *this* will cause glm() to whinge.
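
A small sketch of that failure and of two possible fixes follows. The data are invented, and the names y_chr, x, res, fit_num and fit_fac are assumptions for illustration only; the exact wording of the error may differ between R versions, so it is simply captured and printed here.

```r
# Hypothetical data: a response that was read in as text rather than as numbers.
y_chr <- c("0", "1", "1", "0", "1", "0", "0", "1")
x     <- c(0.2, 1.5, 0.3, -0.7, 0.9, 1.1, -1.2, 0.4)

# A character response is none of the three accepted types, so this fails.
res <- tryCatch(glm(y_chr ~ x, family = binomial),
                error = function(e) conditionMessage(e))
print(res)   # the captured error message, not a fitted model

# Either conversion gives glm() something it accepts.
y_num <- as.numeric(y_chr)                    # numeric 0/1
y_fac <- factor(y_chr, levels = c("0", "1"))  # factor; first level = "failure"

fit_num <- glm(y_num ~ x, family = binomial)
fit_fac <- glm(y_fac ~ x, family = binomial)
```

Both converted versions should give the same coefficients, since "0" is the first factor level and is therefore treated as the failure category.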

I reiterate:  RTFM!!!  (And perhaps learn to distinguish between 
character vectors and factors.)
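
For anyone unsure which of the two they are holding, a quick check along these lines may help (status_chr and status_fac are made-up example objects):

```r
status_chr <- c("0", "1", "1", "0")   # made-up example values
status_fac <- factor(status_chr)

class(status_chr)       # "character"
class(status_fac)       # "factor"
is.factor(status_chr)   # FALSE
levels(status_fac)      # "0" "1"
as.integer(status_fac)  # 1 2 2 1: the internal level codes, not the original 0/1
```

The last line is the usual trap: converting a factor straight to a number returns the level codes, so as.numeric(as.character(status_fac)) is the way back to the original 0/1 values.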

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


