[R] Reporting binomial logistic regression from R results

Frodo Jedi frodojedi.mailinglist at gmail.com
Mon Nov 12 22:45:04 CET 2018


Dear Bert,
I understand, and thank you for your recommendation. Unfortunately I have
no way to consult a statistical expert at the moment, so the advice of the
experts on this list would be crucial for me to understand how R works in
relation to my question.
I hope someone can reply to my remaining questions.

Best regards

FJ

On Mon, Nov 12, 2018 at 7:48 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:

> Generally speaking, this list is about questions on R programming, not
> statistical issues. However, I grant you that your queries are in something
> of a gray area intersecting both.
>
> Nevertheless, based on your admitted confusion, I would recommend that you
> find a local statistical expert with whom you can consult 1-1 if at all
> possible. As others have already noted, your statistical understanding is
> muddy, and it can be quite difficult to resolve such confusion in online
> forums like this that cannot provide the close back and forth that may be
> required (as well as further appropriate study).
>
> Best,
> Bert
>
> On Mon, Nov 12, 2018 at 11:09 AM Frodo Jedi <
> frodojedi.mailinglist using gmail.com> wrote:
>
>> Dear Peter and Eik,
>> I am very grateful to you for your replies.
>> My current understanding is that from the GLM analysis I can indeed
>> conclude that the response predicted by System A is significantly
>> different from that of System B, while the pairwise comparison A vs C is
>> not significant. The Wald test, in turn, seems to be correct only for the
>> B vs C comparison, indicating that the pairwise difference between System
>> B and System C is significant. Am I correct?
>>
>> However, my current understanding is also that I should use contrasts
>> instead of the Wald test. The default treatment contrast uses System A as
>> the reference level, so I should re-run the GLM with a different base. I
>> tried to use the "contrasts" option of glm():
>>
>> > fit1 <- glm(Response ~ System, data = scrd, family = "binomial",
>> contrasts = contr.treatment(3, base=1,contrasts=TRUE))
>> > summary(fit1)
>>
>> > fit2 <- glm(Response ~ System, data = scrd, family = "binomial",
>> contrasts = contr.treatment(3, base=2,contrasts=TRUE))
>> > summary(fit2)
>>
>> > fit3 <- glm(Response ~ System, data = scrd, family = "binomial",
>> contrasts = contr.treatment(3, base=3,contrasts=TRUE))
>> > summary(fit3)
>>
>> However, the outputs of these three summary() calls are identical. Why?
>> That option should have changed the base level, but apparently this is
>> not the case.
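>>
>> Or should I instead change the reference level of the factor before
>> refitting? Just a sketch of what I mean (assuming System is a factor in
>> scrd), in case this is the intended way:
>>
>> # make System B the reference level, then refit the same model
>> scrd$System <- relevel(scrd$System, ref = "B")
>> fit2 <- glm(Response ~ System, data = scrd, family = "binomial")
>> summary(fit2)
>>
>> Or should the contrasts be passed as a named list instead, e.g.
>> contrasts = list(System = contr.treatment(3, base = 2)) ?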
>>
>>
>> Another analysis I found online (at this link
>>
>> https://stats.stackexchange.com/questions/60352/comparing-levels-of-factors-after-a-glm-in-r
>> )
>> to understand the differences between the 3 levels is to use glht() with
>> Tukey contrasts. I performed the following:
>>
>> > library(multcomp)
>> > summary(glht(fit, mcp(System="Tukey")))
>>
>> Simultaneous Tests for General Linear Hypotheses
>>
>> Multiple Comparisons of Means: Tukey Contrasts
>>
>>
>> Fit: glm(formula = Response ~ System, family = "binomial", data = scrd)
>>
>> Linear Hypotheses:
>>              Estimate Std. Error z value Pr(>|z|)
>> B - A == 0    -1.2715     0.3379  -3.763 0.000445 ***
>> C - A == 0     0.8588     0.4990   1.721 0.192472
>> C - B == 0     2.1303     0.4512   4.722  < 1e-04 ***
>> ---
>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> (Adjusted p values reported -- single-step method)
>>
>>
>> Is this Tukey analysis correct?
>>
>>
>> I am a bit confused about which analysis I should do. I am doing my very
>> best to study all the resources I can find, but I would really need some
>> help from experts, especially in using R.
>>
>>
>> Best wishes
>>
>> FJ
>>
>>
>>
>>
>>
>>
>> On Mon, Nov 12, 2018 at 1:46 PM peter dalgaard <pdalgd using gmail.com> wrote:
>>
>> > Yes, only one of the pairwise comparisons (B vs. C) is right. Also, the
>> > overall test has 3 degrees of freedom whereas a comparison of 3 groups
>> > should have 2. You (meaning Frodo) are testing that _all 3_ regression
>> > coefficients are zero, intercept included. That would imply that all
>> > three systems have response probabilities of 0.5, which is not likely
>> > what you want.
>> >
>> > This all suggests that you are struggling with the interpretation of the
>> > regression coefficients and their role in the linear predictor. This
>> > should be covered by any good book on logistic regression.
>> >
>> > -pd
>> >
>> > > On 12 Nov 2018, at 14:15, Eik Vettorazzi <E.Vettorazzi using uke.de> wrote:
>> > >
>> > > Dear Jedi,
>> > > please use the source carefully. A and C are not statistically
>> > > different at the 5% level, which can be inferred from the glm output.
>> > > Your last two wald.tests don't test what you want them to, since your
>> > > model contains an intercept term. You specified contrasts which test
>> > > A vs B-A, i.e. A-(B-A)==0 <-> 2*A-B==0, which is not what you intend,
>> > > I think. Have a look at ?contr.treatment and re-read your source doc
>> > > to get an idea of what dummy coding and indicator variables are about.
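>> > >
>> > > For illustration (just a sketch, reusing the wald.test calls from the
>> > > original post): with treatment coding and A as the reference level,
>> > > the pairwise comparisons would correspond to
>> > >
>> > > # A vs B: the SystemB coefficient alone
>> > > wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 1, 0))
>> > > # A vs C: the SystemC coefficient alone
>> > > wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 0, 1))
>> > > # B vs C: the difference of the two dummy coefficients
>> > > wald.test(b = coef(fit), Sigma = vcov(fit), L = cbind(0, 1, -1))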
>> > >
>> > > Cheers
>> > >
>> > >
>> > > On 12.11.2018 at 02:07, Frodo Jedi wrote:
>> > >> Dear list members,
>> > >> I need some help in understanding whether I am performing a binomial
>> > >> logistic regression correctly and whether I am interpreting the
>> > >> results in the correct way. I would also need advice on how to report
>> > >> the results from the R functions.
>> > >> I want to report the results of a binomial logistic regression in
>> > >> which I assess the effect of the 3 levels of a factor (called System)
>> > >> on the dependent variable (called Response), which takes two values,
>> > >> 0 and 1. My goal is to understand whether the 3 systems (A, B, C)
>> > >> affect Response differently in a statistically significant way. I am
>> > >> basing my analysis on this URL:
>> > >> https://stats.idre.ucla.edu/r/dae/logit-regression/
>> > >> This is the result of my analysis:
>> > >>> fit <- glm(Response ~ System, data = scrd, family = "binomial")
>> > >>> summary(fit)
>> > >> Call:
>> > >> glm(formula = Response ~ System, family = "binomial", data = scrd)
>> > >> Deviance Residuals:
>> > >>     Min       1Q   Median       3Q      Max
>> > >> -2.8840   0.1775   0.2712   0.2712   0.5008
>> > >> Coefficients:
>> > >>             Estimate Std. Error z value Pr(>|z|)
>> > >> (Intercept)   3.2844     0.2825  11.626  < 2e-16 ***
>> > >> SystemB      -1.2715     0.3379  -3.763 0.000168 ***
>> > >> SystemC       0.8588     0.4990   1.721 0.085266 .
>> > >> ---
>> > >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> > >> (Dispersion parameter for binomial family taken to be 1)
>> > >>     Null deviance: 411.26  on 1023  degrees of freedom
>> > >> Residual deviance: 376.76  on 1021  degrees of freedom
>> > >> AIC: 382.76
>> > >> Number of Fisher Scoring iterations: 6
>> > >> Following this analysis I perform the Wald test in order to
>> > >> understand whether there is an overall effect of System:
>> > >> library(aod)
>> > >>> wald.test(b = coef(fit), Sigma = vcov(fit), Terms = 1:3)
>> > >> Wald test:
>> > >> ----------
>> > >> Chi-squared test:
>> > >> X2 = 354.6, df = 3, P(> X2) = 0.0
>> > >> The chi-squared test statistic of 354.6, with 3 degrees of freedom,
>> > >> is associated with a p-value < 0.001, indicating that the overall
>> > >> effect of System is statistically significant.
>> > >> Now I check whether there are differences between the coefficients,
>> > >> again using the Wald test:
>> > >> # Here difference between system B and C:
>> > >>> l <- cbind(0, 1, -1)
>> > >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> > >> Wald test:
>> > >> ----------
>> > >> Chi-squared test:
>> > >> X2 = 22.3, df = 1, P(> X2) = 2.3e-06
>> > >> # Here difference between system A and C:
>> > >>> l <- cbind(1, 0, -1)
>> > >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> > >> Wald test:
>> > >> ----------
>> > >> Chi-squared test:
>> > >> X2 = 12.0, df = 1, P(> X2) = 0.00052
>> > >> # Here difference between system A and B:
>> > >>> l <- cbind(1, -1, 0)
>> > >>> wald.test(b = coef(fit), Sigma = vcov(fit), L = l)
>> > >> Wald test:
>> > >> ----------
>> > >> Chi-squared test:
>> > >> X2 = 58.7, df = 1, P(> X2) = 1.8e-14
>> > >> My understanding is that from this analysis I can state that the
>> > >> three systems lead to a significantly different Response. Am I right?
>> > >> If so, how should I report the results of this analysis? What is the
>> > >> correct way?
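>> > >>
>> > >> Would reporting the coefficients as odds ratios with 95% confidence
>> > >> intervals be an acceptable way? Just a sketch of what I mean (I am
>> > >> not sure this is appropriate):
>> > >>
>> > >> # odds ratios and 95% confidence intervals for the fitted model
>> > >> exp(cbind(OR = coef(fit), confint(fit)))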
>> > >> Thanks in advance
>> > >> Best wishes
>> > >> FJ
>> > >
>> > > --
>> > > Eik Vettorazzi
>> > >
>> > > Department of Medical Biometry and Epidemiology
>> > > University Medical Center Hamburg-Eppendorf
>> > >
>> > > Martinistrasse 52
>> > > building W 34
>> > > 20246 Hamburg
>> > >
>> > > Phone: +49 (0) 40 7410 - 58243
>> > > Fax:   +49 (0) 40 7410 - 57790
>> > > Web: www.uke.de/imbe
>> >
>> > --
>> > Peter Dalgaard, Professor,
>> > Center for Statistics, Copenhagen Business School
>> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> > Phone: (+45)38153501
>> > Office: A 4.23
>> > Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
>> >
>>
>>
>



