[R] Dependent Variable in Logistic Regression

Rui Barradas
Sat Aug 1 21:48:12 CEST 2020


Hello,

Inline.

At 20:01 on 01/08/2020, John Fox wrote:
> Dear Paul,
>
> I think that this thread has gotten unnecessarily complicated. The 
> answer, as is easily demonstrated, is that a binary response for a 
> binomial GLM in glm() may be a factor, a numeric variable, or a 
> logical variable, with identical results; for example:
>
> --------------- snip -------------
>
> > set.seed(123)
>
> > head(x <- rnorm(100))
> [1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499
>
> > head(y <- rbinom(100, 1, 1/(1 + exp(-x))))
> [1] 0 1 1 1 1 0
>
> > head(yf <- as.factor(y))
> [1] 0 1 1 1 1 0
> Levels: 0 1
>
> > head(yl <- y == 1)
> [1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE
>
> > glm(y ~ x, family=binomial)
>
> Call:  glm(formula = y ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:        134.6
> Residual Deviance: 114.9     AIC: 118.9
>
> > glm(yf ~ x, family=binomial)
>
> Call:  glm(formula = yf ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:        134.6
> Residual Deviance: 114.9     AIC: 118.9
>
> > glm(yl ~ x, family=binomial)
>
> Call:  glm(formula = yl ~ x, family = binomial)
>
> Coefficients:
> (Intercept)            x
>      0.3995       1.1670
>
> Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
> Null Deviance:        134.6
> Residual Deviance: 114.9     AIC: 118.9
>
> --------------- snip -------------
>
> The original poster claimed to have encountered an error with a 0/1 
> numeric response, but didn't show any data or even a command. I 
> suspect that the response was a character variable, but of course 
> can't really know that.

So continuing with your example:

 > head(yc <- as.character(y))
[1] "0" "1" "1" "1" "1" "0"
 > glm(yc ~ x, family=binomial)
Error in weights * y : non-numeric argument to binary operator


But the OP says that

[...] R complains that I should make the dependent variable a factor.

That is not what the error message says; it "asks" for a numeric 
argument to the '*' operator, not for a factor.
We haven't seen the exact R message yet, so, as others have said, the 
OP should post it along with the code.
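
If the response really was read in as character, a possible way out is 
sketched below. This is only an illustration built on John's simulated 
data, not the OP's data: the names yc, res, fit_f and fit_n are made up 
here, and whether the first call fails with exactly the message above 
may depend on the R version, so it is wrapped in try().

```r
# Simulated data, as in John's example
set.seed(123)
x <- rnorm(100)
y <- rbinom(100, 1, 1/(1 + exp(-x)))

# Assumption: the response was imported as character, not numeric
yc <- as.character(y)

# Fitting with the character response; try() keeps the script running
# if glm() stops with an error
res <- try(glm(yc ~ x, family = binomial), silent = TRUE)
inherits(res, "try-error")

# Converting the response to a factor or to numeric before fitting
fit_f <- glm(factor(yc) ~ x, family = binomial)
fit_n <- glm(as.numeric(yc) ~ x, family = binomial)

# Both conversions should give the same coefficients
all.equal(coef(fit_f), coef(fit_n))
```

Note that with a factor response glm() takes the first level as 
"failure" and all others as "success", so with yes/no labels it is 
worth checking levels() before fitting.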

Hope this helps,

Rui Barradas

>
> Best,
>  John
>
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://socialsciences.mcmaster.ca/jfox/
>
> On 2020-08-01 2:25 p.m., Paul Bernal wrote:
>> Dear friend,
>>
>> I am aware that I have a binomial dependent variable, which is covid status
>> (1 if covid positive, and 0 otherwise).
>>
>> My question was whether R requires a binomial response variable to be
>> turned into a factor or not, that's all.
>>
>> Cheers,
>>
>> Paul
>>
>> On Sat, Aug 1, 2020 at 1:22 p.m., Bert Gunter
>> <bgunter.4567 using gmail.com> wrote:
>>
>>> ... yes, but so does lm() for a categorical **INdependent** variable with
>>> more than 2 numerically labeled levels. n levels  = (n-1) df for a
>>> categorical covariate, but 1 for a continuous one (unless more complex
>>> models are explicitly specified of course). As I said, the OP seems
>>> confused about whether he is referring to the response or covariates. Or
>>> maybe he just made the same typo I did.
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along and
>>> sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>>
>>> On Sat, Aug 1, 2020 at 11:15 AM Patrick (Malone Quantitative) <
>>> malone using malonequantitative.com> wrote:
>>>
>>>> No, R does not. glm() does in order to do logistic regression.
>>>>
>>>> On Sat, Aug 1, 2020 at 2:11 PM Paul Bernal <paulbernal07 using gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Bert,
>>>>>
>>>>> Thank you for the kind reply.
>>>>>
>>>>> But what if I don't turn the variable into a factor? Let's say that in
>>>>> Excel I just coded the variable as 1s and 0s, imported the dataset
>>>>> into R, and fitted the logistic regression without turning any
>>>>> categorical variable or dummy variable into a factor.
>>>>>
>>>>> Does R require every dummy variable to be treated as a factor?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Paul
>>>>>
>>>>> On Sat, Aug 1, 2020 at 12:59 p.m., Bert Gunter <
>>>>> bgunter.4567 using gmail.com> wrote:
>>>>>
>>>>>> x <- factor(0:1)
>>>>>> x <- factor(c("yes","no"))
>>>>>>
>>>>>> will produce identical results up to labeling.
>>>>>>
>>>>>>
>>>>>> Bert Gunter
>>>>>>
>>>>>> "The trouble with having an open mind is that people keep coming along and
>>>>>> sticking things into it."
>>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>>>>
>>>>>>
>>>>>> On Sat, Aug 1, 2020 at 10:40 AM Paul Bernal <paulbernal07 using gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Dear friends,
>>>>>>>
>>>>>>> Hope you are doing great. I want to fit a logistic regression in R, where
>>>>>>> the dependent variable is the covid status (I used 1 for covid positives,
>>>>>>> and 0 for covid negatives), but when I ran the glm, R complains that I
>>>>>>> should make the dependent variable a factor.
>>>>>>>
>>>>>>> What would be more advisable, to keep the dependent variable with 1s and
>>>>>>> 0s, or code it as yes/no and then make it a factor?
>>>>>>>
>>>>>>> Any guidance will be greatly appreciated,
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Paul
>>>>>>>
>>>>>>>          [[alternative HTML version deleted]]
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> -- 
>>>> Patrick S. Malone, Ph.D., Malone Quantitative
>>>> NEW Service Models: http://malonequantitative.com
>>>>
>>>> He/Him/His
>>>>
>>>
>>
>

