[R] Reporting binomial logistic regression from R results

peter dalgaard pd@lgd @ending from gm@il@com
Mon Nov 12 14:46:07 CET 2018


Yes, only one of the pairwise comparisons (B vs. C) is right. Also, the overall test has 3 degrees of freedom whereas a comparison of 3 groups should have 2. You (meaning Frodo) are testing that _all 3_ regression coefficients are zero, intercept included. That would imply that all three systems have response probablilities og 0.5, which is not likely what you want.

This all suggests that you are struggling with the interpretation of the regression coefficients and their role in the linear predictor. This should be covered by any good book on logistic regression.

-pd  

> On 12 Nov 2018, at 14:15 , Eik Vettorazzi <E.Vettorazzi using uke.de> wrote:
> 
> Dear Jedi,
> please use the source carefully. A and C are not statistically different at the 5% level, which can be inferred from glm output. Your last two wald.tests don't test what you want to, since your model contains an intercept term. You specified contrasts which tests A vs B-A, ie A- (B-A)==0 <-> 2*A-B==0 which is not intended I think. Have a look at ?contr.treatment and re-read your source doc to get an idea what dummy coding and indicatr variables are about.
> 
> Cheers
> 
> 
> Am 12.11.2018 um 02:07 schrieb Frodo Jedi:
>> Dear list members,
>> I need some help in understanding whether I am doing correctly a binomial
>> logistic regression and whether I am interpreting the results in the
>> correct way. Also I would need an advice regarding the reporting of the
>> results from the R functions.
>> I want to report the results of a binomial logistic regression where I want
>> to assess difference between the 3 levels of a factor (called System) on
>> the dependent variable (called Response) taking two values, 0 and 1. My
>> goal is to understand if the effect of the 3 systems (A,B,C) in System
>> affect differently Response in a significant way. I am basing my analysis
>> on this URL: https://stats.idre.ucla.edu/r/dae/logit-regression/
>> This is the result of my analysis:
>>> fit <- glm(Response ~ System, data = scrd, family = "binomial")
>>> summary(fit)
>> Call:
>> glm(formula = Response ~ System, family = "binomial", data = scrd)
>> Deviance Residuals:
>>     Min       1Q   Median       3Q      Max
>> -2.8840   0.1775   0.2712   0.2712   0.5008
>> Coefficients:
>>              Estimate Std. Error z value Pr(>|z|)
>> (Intercept)    3.2844     0.2825  11.626  < 2e-16 ***
>> SystemB  -1.2715     0.3379  -3.763 0.000168 ***
>> SystemC    0.8588     0.4990   1.721 0.085266 .
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> (Dispersion parameter for binomial family taken to be 1)
>>     Null deviance: 411.26  on 1023  degrees of freedom
>> Residual deviance: 376.76  on 1021  degrees of freedom
>> AIC: 382.76
>> Number of Fisher Scoring iterations: 6
>> Following this analysis I perform the wald test in order to understand
>> whether there is an overall effect of System:
>> library(aod)
>>> wald.test(b = coef(fit), Sigma = vcov(fit), Terms = 1:3)
>> Wald test:
>> ----------
>> Chi-squared test:
>> X2 = 354.6, df = 3, P(> X2) = 0.0
>> The chi-squared test statistic of 354.6, with 3 degrees of freedom is
>> associated with a p-value < 0.001 indicating that the overall effect of
>> System is statistically significant.
>> Now I check whether there are differences between the coefficients using
>> again the wald test:
>> # Here difference between system B and C:
>>> l <- cbind(0, 1, -1)
>>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> Wald test:
>> ----------
>> Chi-squared test:
>> X2 = 22.3, df = 1, P(> X2) = 2.3e-06
>> # Here difference between system A and C:
>>> l <- cbind(1, 0, -1)
>>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> Wald test:
>> ----------
>> Chi-squared test:
>> X2 = 12.0, df = 1, P(> X2) = 0.00052
>> # Here difference between system A and B:
>>> l <- cbind(1, -1, 0)
>>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> Wald test:
>> ----------
>> Chi-squared test:
>> X2 = 58.7, df = 1, P(> X2) = 1.8e-14
>> My understanding is that from this analysis I can state that the three
>> systems lead to a significantly different Response. Am I right? If so, how
>> should I report the results of this analysis? What is the correct way?
>> Thanks in advance
>> Best wishes
>> FJ
>> 	[[alternative HTML version deleted]]
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Eik Vettorazzi
> 
> Department of Medical Biometry and Epidemiology
> University Medical Center Hamburg-Eppendorf
> 
> Martinistrasse 52
> building W 34
> 20246 Hamburg
> 
> Phone: +49 (0) 40 7410 - 58243
> Fax:   +49 (0) 40 7410 - 57790
> Web: www.uke.de/imbe
> --
> 
> _____________________________________________________________________
> 
> Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
> Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof. Dr. Dr. Uwe Koch-Gromus, Joachim Prölß, Marya Verdel
> _____________________________________________________________________
> 
> SAVE PAPER - THINK BEFORE PRINTING
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com



More information about the R-help mailing list