[R] Interpreting summary.lm for a 2 factor anova

Ashim Kapoor ashimkapoor at gmail.com
Sun Dec 4 06:44:26 CET 2016


Dear Sir,

Alright.

Best Regards,
Ashim

On Sun, Dec 4, 2016 at 10:59 AM, Richard M. Heiberger <rmh at temple.edu>
wrote:

> As Petr Pikal mentioned, the difficulty in interpretation is entirely due
> to the set of contrasts you chose. The default treatment contrasts are
> not orthogonal and are therefore the most difficult to interpret.
> The note in ?aov warns of this difficulty.
>
> Sum contrasts will give you the numbers that are easiest to interpret.
>
>
> options(contrasts = c("contr.sum", "contr.poly"))
> warpbreakssum.aov <- aov(breaks ~ wool * tension, data = warpbreaks)
> coef(warpbreakssum.aov)
> model.tables(warpbreakssum.aov, type="effects")
> model.tables(warpbreakssum.aov, type="means")
>
>
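As a rough check of how the sum-contrast coefficients read, here is a minimal sketch. It assumes the built-in warpbreaks data (balanced, 9 observations per cell) and passes the contrasts to aov() directly instead of through options(); the names fit_sum, b, grand_mean and wool_means are only illustrative and do not come from the thread.

```r
# Fit with sum-to-zero contrasts set locally, leaving global options untouched.
fit_sum <- aov(breaks ~ wool * tension, data = warpbreaks,
               contrasts = list(wool = "contr.sum", tension = "contr.sum"))
b <- coef(fit_sum)

# In this balanced design the intercept should equal the grand mean ...
grand_mean <- mean(warpbreaks$breaks)
all.equal(b[["(Intercept)"]], grand_mean)

# ... and wool1 should equal the wool A effect (mean of A minus grand mean).
# The wool B effect is then -wool1, because the effects sum to zero.
wool_means <- tapply(warpbreaks$breaks, warpbreaks$wool, mean)
all.equal(b[["wool1"]], wool_means[["A"]] - grand_mean)
```

If both comparisons return TRUE, the coefficients line up with the "Grand mean" and "wool" entries printed by model.tables().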
> John Fox showed the algebra using the default treatment contrasts.
>
> For a full understanding you will need to read more, in a text, about
> sets of linear contrasts and their algebra.
> I recommend Section 10.3 in mine, of course.
>
> Statistical Analysis and Data Display:
> An Intermediate Course with Examples in R
> Heiberger, Richard M., Holland, Burt
>
> http://www.springer.com/us/book/9781493921218
>
> On Sat, Dec 3, 2016 at 11:46 PM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
> > On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
> >
> >> Dear Sir,
> >>
> >> Many thanks for the explanation. Prior to your email (with some help from
> >> a friend of mine) I was able to figure this one out. If we look at the
> >> model:
> >>
> >> y = intercept + B1*woolB + B2*tensionM + B3*tensionH
> >>     + B4*woolB*tensionM + B5*woolB*tensionH + error
> >>
> >> Here woolB, tensionM, tensionH are the dummy indicator variables similar
> >> to how you have defined them.
> >>
> >> Now suppose we consider y1,..,yn, all in group A.L (say).
> >>
> >> Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 +
> >> 0 + 0 + 0.
> >
> > This should be: y1 + ... + yn = n * intercept
> >
> >> What was confusing me was how to compute the cell mean in the
> >> woolB,tensionH cell.
> >>
> >> If we have y_1,...,y_n all in group B.H then:
> >>
> >> y_1 + ... + y_n = intercept + B1 + 0 + B3 + 0 + B5
> >
> > This should be: y_1 + ... + y_n = n * (intercept + B1 + 0 + B3 + 0 + B5)
> >
> >
> >
> >> Therefore average of group B.H = intercept + B1 + B3 + B5
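The B.H arithmetic above can be checked with a short sketch. It assumes the built-in warpbreaks data and requests treatment contrasts explicitly, so it does not depend on any earlier options() call; the names fit, b, bh_by_hand, cell_means and cells are illustrative only.

```r
# Treatment contrasts are requested explicitly so the sketch does not depend
# on the global options(contrasts = ...) setting.
fit <- aov(breaks ~ wool * tension, data = warpbreaks,
           contrasts = list(wool = "contr.treatment",
                            tension = "contr.treatment"))
b <- coef(fit)

# Cell B.H by hand: intercept + B1 (woolB) + B3 (tensionH) + B5 (woolB:tensionH)
bh_by_hand <- b[["(Intercept)"]] + b[["woolB"]] + b[["tensionH"]] +
  b[["woolB:tensionH"]]

# Observed cell means, wool in rows and tension in columns
cell_means <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))
all.equal(bh_by_hand, cell_means["B", "H"])

# predict() does the same bookkeeping for every cell at once
cells <- expand.grid(wool = levels(warpbreaks$wool),
                     tension = levels(warpbreaks$tension))
cells$from_coefs <- predict(fit, newdata = cells)
cells
```

Each row of cells sums exactly the coefficients whose dummy variables equal 1 for that cell, which is the same rule used by hand for B.H.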
> >>
> >> Many thanks and Best Regards,
> >> Ashim
> >>
> >>
> >>
> >> On Sat, Dec 3, 2016 at 7:15 PM, Fox, John <jfox at mcmaster.ca> wrote:
> >>
> >>> Dear Ashim,
> >>>
> >>> Sorry to chime in late, and my apologies if someone has already pointed
> >>> this out, but here's the relationship between the cell means and the model
> >>> coefficients, using the row-basis of the model matrix:
> >>>
> >>> -------------------------- snip ------------------------
> >>>
> >>> > means <- with(warpbreaks, tapply(breaks, interaction(wool, tension), mean))
> >>> > x.A <- rep(c(0, 1), 3)
> >>> > x.B1 <- rep(c(0, 1, 0), each=2)
> >>> > x.B2 <- rep(c(0, 0, 1), each=2)
> >>> > x.AB1 <- x.A*x.B1
> >>> > x.AB2 <- x.A*x.B2
> >>> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
> >>> > X.basis
> >>>        x.A x.B1 x.B2 x.AB1 x.AB2
> >>> [1,] 1   0    0    0     0     0
> >>> [2,] 1   1    0    0     0     0
> >>> [3,] 1   0    1    0     0     0
> >>> [4,] 1   1    1    0     1     0
> >>> [5,] 1   0    0    1     0     0
> >>> [6,] 1   1    0    1     0     1
> >>> > solve(X.basis, means)
> >>>                 x.A      x.B1      x.B2     x.AB1     x.AB2
> >>>  44.55556 -16.33333 -20.55556 -20.00000  21.11111  10.55556
> >>> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
> >>>    (Intercept)          woolB       tensionM       tensionH woolB:tensionM
> >>>       44.55556      -16.33333      -20.55556      -20.00000       21.11111
> >>> woolB:tensionH
> >>>       10.55556
> >>>
> >>> -------------------------- snip ------------------------
> >>>
> >>> I hope this helps,
> >>>  John
> >>>
> >>> -----------------------------
> >>> John Fox, Professor
> >>> McMaster University
> >>> Hamilton, Ontario
> >>> Canada L8S 4M4
> >>> Web: socserv.mcmaster.ca/jfox
> >>>
> >>>
> >>>
> >>> > -----Original Message-----
> >>> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim Kapoor
> >>> > Sent: December 3, 2016 12:19 AM
> >>> > To: David Winsemius <dwinsemius at comcast.net>
> >>> > Cc: r-help at r-project.org
> >>> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
> >>> >
> >>> > Please allow me to rephrase my query.
> >>> >
> >>> > > model.tables(model,"m")
> >>> > Tables of means
> >>> > Grand mean
> >>> >
> >>> > 28.14815
> >>> >
> >>> >  wool
> >>> > wool
> >>> >      A      B
> >>> > 31.037 25.259
> >>> >
> >>> >  tension
> >>> > tension
> >>> >     L     M     H
> >>> > 36.39 26.39 21.67
> >>> >
> >>> >  wool:tension
> >>> >     tension
> >>> > wool L     M     H
> >>> >    A 44.56 24.00 24.56
> >>> >    B 28.22 28.78 18.78
> >>> > >
> >>> >
> >>> >
> >>> > The above is the same as :
> >>> >
> >>> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
> >>> >      A.L      B.L      A.M      B.M      A.H      B.H
> >>> > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
> >>> >
> >>> > For reference:
> >>> >
> >>> > > model <- aov(breaks ~ wool * tension, data = warpbreaks)
> >>> > > summary.lm(model)
> >>> >
> >>> > Call:
> >>> > aov(formula = breaks ~ wool * tension, data = warpbreaks)
> >>> >
> >>> > Residuals:
> >>> >      Min       1Q   Median       3Q      Max
> >>> > -19.5556  -6.8889  -0.6667   7.1944  25.4444
> >>> >
> >>> > Coefficients:
> >>> >                Estimate Std. Error t value Pr(>|t|)
> >>> > (Intercept)      44.556      3.647  12.218 2.43e-16 ***
> >>> > woolB           -16.333      5.157  -3.167 0.002677 **
> >>> > tensionM        -20.556      5.157  -3.986 0.000228 ***
> >>> > tensionH        -20.000      5.157  -3.878 0.000320 ***
> >>> > woolB:tensionM   21.111      7.294   2.895 0.005698 **
> >>> > woolB:tensionH   10.556      7.294   1.447 0.154327
> >>> > ---
> >>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>> >
> >>> > Residual standard error: 10.94 on 48 degrees of freedom
> >>> > Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
> >>> > F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
> >>> >
> >>> >
> >>> > Now I'll explain what is confusing me in the output of summary.lm.
> >>> >
> >>> > Coeff of Intercept = 44.556 = cell mean for A.L. This is the base.
> >>> >
> >>> > Coeff of woolB = -16.333 = 28.22222 - 44.556. This is the difference of
> >>> > the cell mean (B:L) from the base.
> >>> >
> >>> > Coeff of tensionM = -20.556 = 24.000 - 44.556. This is the difference of
> >>> > the cell mean (A:M) from the base.
> >>> >
> >>> > Coeff of tensionH = -20.000 = 24.55556 - 44.556. This is the difference
> >>> > of the cell mean (A:H) from the base.
> >>> >
> >>> > This is where it stops being the difference from the base.
> >>> >
> >>> > Coeff of woolB:tensionM = 21.111. I expected 28.77778 - 44.556, but that
> >>> > is -15.77822.
> >>> >
> >>> > Coeff of woolB:tensionH = 10.556. I expected 18.77778 - 44.556, but that
> >>> > is -25.77822.
> >>> >
> >>> > In the above 2 cases, we can't say that the coefficient = cell mean -
> >>> > base case. What statement should be made instead?
> >>> >
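One way to phrase it, sketched here under the assumption of the default treatment contrasts: each interaction coefficient is a difference of differences, that is, the B-minus-A gap at tension M (or H) minus the B-minus-A gap at the baseline tension L. The names m, did_M, did_H and fit below are illustrative only.

```r
# Observed cell means, wool in rows and tension in columns
m <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))

# woolB:tensionM = (B - A at tension M) - (B - A at tension L)
did_M <- (m["B", "M"] - m["A", "M"]) - (m["B", "L"] - m["A", "L"])
# woolB:tensionH = (B - A at tension H) - (B - A at tension L)
did_H <- (m["B", "H"] - m["A", "H"]) - (m["B", "L"] - m["A", "L"])

# Compare with the fitted interaction coefficients (treatment contrasts
# requested explicitly so the global options do not matter).
fit <- aov(breaks ~ wool * tension, data = warpbreaks,
           contrasts = list(wool = "contr.treatment",
                            tension = "contr.treatment"))
all.equal(did_M, coef(fit)[["woolB:tensionM"]])
all.equal(did_H, coef(fit)[["woolB:tensionH"]])
```

So the statement for the interaction terms would be "how much the wool B versus wool A difference changes when moving from tension L to tension M (or H)", not "cell mean minus base".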
> >>> >
> >>> > Best Regards,
> >>> > Ashim
> >>> >
> >>> > PS: My apologies for emailing my query to this list. Can you tell me the
> >>> > names of a few (active) statistics help lists?
> >>> >
> >>> > On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >>> >
> >>> > >
> >>> > > > On Dec 2, 2016, at 9:09 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >>> > > >
> >>> > > >>
> >>> > > >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor <ashimkapoor at gmail.com> wrote:
> >>> > > >>
> >>> > > >> Dear Pikal,
> >>> > > >>
> >>> > > >> All levels except the interactions are compared to the Intercept.
> >>> > > >> I'm a little confused as to what's going on in interaction terms,
> >>> > > >> e.g. the cell wool B : tension M. Its mean is
> >>> > > >> 28.78 and 28.78 - 44.56 = -15.78 != 21.111.
> >>> > > >>
> >>> > > >> It's something like 44.56 (intercept) - 16.333 (wool B) - 20.556
> >>> > > >> (tension M) + 21.111 (woolB:tensionM) = 28.782.
> >>> > > >>
> >>> > > >> I don't know how to sum up the above line in terms of differences
> >>> > > >> succinctly.
> >>> > > >
> >>> > > > The aov estimate will not exactly equal the observed mean (this is
> >>> > > > _statistics_ after all). You should be comparing the mean of that cell
> >>> > > > to the estimate:
> >>> > > >
> >>> > > > 44.556 + (-16.33) +(-20.556) + (21.11)
> >>> > >
> >>> > > A respected participant advised me to look at this more closely. In
> >>> > > this case (and I think in most such cases) where there are the same
> >>> > > number of parameters as there are means, the model is "saturated" and
> >>> > > there is no difference:
> >>> > >
> >>> > >  with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
> >>> > >      A.L      B.L      A.M      B.M      A.H      B.H
> >>> > > 44.55556 28.22222 24.00000 28.77778 24.55556 18.77778
> >>> > >
> >>> > > So the B:M estimate is identical up to rounding with the observed mean:
> >>> > >
> >>> > >  44.556 + (-16.33) +(-20.556) + (21.11)
> >>> > > [1] 28.78
> >>> > >
> >>> > >
> >>> > >
> >>> > > >
> >>> > > > The difference between the observed mean and the estimated mean is
> >>> > > > known as a 'residual'
> >>> > >
> >>> > > I've also been privately but gently chided for this misstatement.
> >>> > > Residuals are the difference between data and estimates.
> >>> > >
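Both corrections can be illustrated with a small sketch, assuming the built-in warpbreaks data: in this saturated model the fitted value for every observation is its own cell mean, and the residuals are the raw observations minus those fitted values. The names fit and cell_mean are illustrative only.

```r
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# Per-observation cell mean: ave() replaces each value by the mean of its
# wool-by-tension group.
cell_mean <- with(warpbreaks, ave(breaks, wool, tension))

# Saturated model: fitted values coincide with the observed cell means.
all.equal(unname(fitted(fit)), cell_mean)

# Residuals are data minus estimates, one per observation (54 of them).
all.equal(unname(residuals(fit)), warpbreaks$breaks - cell_mean)

# The range of these residuals is what summary.lm reports as Min and Max.
range(residuals(fit))
```

The fitted values do not depend on which contrasts are in effect, so this check should give the same answer under treatment or sum contrasts.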
> >>> > > > and the sum of all the squared residuals is what is being minimized
> >>> > > > ... over all the cells including the one implicitly associated with the
> >>> > > > Intercept.
> >>> > > >
> >>> > > > This isn't really on-topic for Rhelp since you are not having difficulty
> >>> > > > in getting the R program to perform its duties, but are rather in need of
> >>> > > > statistical education. That is not what this mailing list is set up for.
> >>> > > >
> >>> > > > --
> >>> > > > David.
> >>> > > >
> >>> > > >>
> >>> > > >>>
> >>> > > >>>> -----Original Message-----
> >>> > > >>>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Ashim
> >>> > > >>>> Kapoor
> >>> > > >>>> Sent: Thursday, December 1, 2016 2:48 PM
> >>> > > >>>> To: r-help at r-project.org
> >>> > > >>>> Subject: [R] Interpreting summary.lm for a 2 factor anova
> >>> > > >>>>
> >>> > > >>>> Dear all,
> >>> > > >>>>
> >>> > > >>>> Here is a small example : -
> >>> > > >>>>
> >>> > > >>>>> model <- aov(breaks ~ wool * tension, data = warpbreaks)
> >>> > > >>>>> summary.lm(model)
> >>> > > >>>>
> >>> > > >>>> Call:
> >>> > > >>>> aov(formula = breaks ~ wool * tension, data = warpbreaks)
> >>> > > >>>>
> >>> > > >>>> Residuals:
> >>> > > >>>>    Min       1Q   Median       3Q      Max
> >>> > > >>>> -19.5556  -6.8889  -0.6667   7.1944  25.4444
> >>> > > >>>>
> >>> > > >>>> Coefficients:
> >>> > > >>>>              Estimate Std. Error t value Pr(>|t|)
> >>> > > >>>> (Intercept)      44.556      3.647  12.218 2.43e-16 ***
> >>> > > >>>> woolB           -16.333      5.157  -3.167 0.002677 **
> >>> > > >>>> tensionM        -20.556      5.157  -3.986 0.000228 ***
> >>> > > >>>> tensionH        -20.000      5.157  -3.878 0.000320 ***
> >>> > > >>>> woolB:tensionM   21.111      7.294   2.895 0.005698 **
> >>> > > >>>> woolB:tensionH   10.556      7.294   1.447 0.154327
> >>> > > >>>> ---
> >>> > > >>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>> > > >>>>
> >>> > > >>>> Residual standard error: 10.94 on 48 degrees of freedom
> >>> > > >>>> Multiple R-squared:  0.3778,    Adjusted R-squared:  0.3129
> >>> > > >>>> F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
> >>> > > >>>>
> >>> > > >>>>> model.tables(model,"e")
> >>> > > >>>> Tables of effects
> >>> > > >>>>
> >>> > > >>>> wool
> >>> > > >>>> wool
> >>> > > >>>>     A       B
> >>> > > >>>> 2.8889 -2.8889
> >>> > > >>>>
> >>> > > >>>> tension
> >>> > > >>>> tension
> >>> > > >>>>    L      M      H
> >>> > > >>>> 8.241 -1.759 -6.481
> >>> > > >>>>
> >>> > > >>>> wool:tension
> >>> > > >>>>   tension
> >>> > > >>>> wool L      M      H
> >>> > > >>>>  A  5.278 -5.278  0.000
> >>> > > >>>>  B -5.278  5.278  0.000
> >>> > > >>>>
> >>> > > >>>>
> >>> > > >>>>> model.tables(model,"m")
> >>> > > >>>> Tables of means
> >>> > > >>>> Grand mean
> >>> > > >>>>
> >>> > > >>>> 28.14815
> >>> > > >>>>
> >>> > > >>>> wool
> >>> > > >>>> wool
> >>> > > >>>>    A      B
> >>> > > >>>> 31.037 25.259
> >>> > > >>>>
> >>> > > >>>> tension
> >>> > > >>>> tension
> >>> > > >>>>   L     M     H
> >>> > > >>>> 36.39 26.39 21.67
> >>> > > >>>>
> >>> > > >>>> wool:tension
> >>> > > >>>>   tension
> >>> > > >>>> wool L     M     H
> >>> > > >>>>  A 44.56 24.00 24.56
> >>> > > >>>>  B 28.22 28.78 18.78
> >>> > > >>>>>
> >>> > > >>>>
> >>> > > >>>> I don't follow the output of summary.lm. I understand the output of
> >>> > > >>>> model.tables for effects and means. For instance, what does 44.556
> >>> > > >>>> represent? Is it the grand average? The grand mean is 28.14815. Can
> >>> > > >>>> someone help me understand the output of summary.lm?
> >>> > > >>>>
> >>> > > >>>> Best Regards,
> >>> > > >>>> Ashim
> >>> > > >>>>
> >>> > > >>>>     [[alternative HTML version deleted]]
> >>> > > >>>>
> >>> > > >>>> ______________________________________________
> >>> > > >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> > > >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> > > >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> > > >>>> and provide commented, minimal, self-contained, reproducible code.
> >>> > > >>>
> >>> > > >
> >>> > > > David Winsemius
> >>> > > > Alameda, CA, USA
> >>> > > >
> >>> > >
> >>> > > David Winsemius
> >>> > > Alameda, CA, USA
> >>> > >
> >>> > >
> >>> >
> >>>
> >>
> >>
> >
>
