[R] aov and non-categorical variables

Liaw, Andy andy_liaw at merck.com
Thu Oct 16 03:37:59 CEST 2003


> From: Alexander Sirotkin [at Yahoo] [mailto:alex_s_42 at yahoo.com] 
> 
> Thanks. One more question, if you don't mind.
> 
> If  instead of aov(), I call lm() directly it fits a
> linear regression model and if it encounters
> categorical variable it does what needs to be done in
> this case - defines a new indicator variable for each
> level of categorical var.

What ANOVA table are you talking about; i.e., from which function?  There is
no anova() method for aov objects (they inherit from class "lm", so anova()
dispatches to the lm method), so you will see identical results from
anova(fit) whether `fit' is fitted by a direct call to lm() or to aov().  The
difference is in the output of summary().  For an aov object, summary() just
prints the ANOVA table, which gives the same answer as anova().  For an lm
object, summary() prints the coefficients, se's and the associated t-tests
for each term (or each contrast, for categorical variables).  That's not an
ANOVA table.
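
A minimal sketch of the point above, on simulated data.  The variable names
y, x and x2 mirror the output further down, but the values are made up for
illustration and are not the original poster's data:

```r
# Simulated data (an assumption): x is a 3-level factor, x2 is numeric.
set.seed(1)
d <- data.frame(x  = factor(rep(c("a", "b", "c"), each = 4)),
                x2 = rnorm(12))
d$y <- rnorm(12) + 0.5 * d$x2

fit.lm  <- lm(y ~ x + x2, data = d)
fit.aov <- aov(y ~ x + x2, data = d)

# aov objects inherit from "lm", so anova() uses the same method for both.
class(fit.aov)

# The two ANOVA tables agree term by term.
tab.lm  <- anova(fit.lm)
tab.aov <- anova(fit.aov)
all.equal(tab.lm[["Sum Sq"]], tab.aov[["Sum Sq"]])

# summary() is where they differ:
summary(fit.aov)   # ANOVA table, one row per term
summary(fit.lm)    # coefficient table, one row per contrast/slope
```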

BTW (for the developers), the labels for the terms shown below (output of
summary.lm) look rather confusing:

> summary(fit2)

Call:
lm(formula = y ~ x + x2)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.2779 -0.7437  0.3228  0.7196  0.8628 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   0.8287     0.4990   1.661   0.1353  
x2           -0.6625     0.6817  -0.972   0.3596  
x3           -1.6203     0.6843  -2.368   0.0454 *
x2            0.5110     0.2150   2.377   0.0448 *
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 

Residual standard error: 0.9604 on 8 degrees of freedom
Multiple R-Squared: 0.5587,     Adjusted R-squared: 0.3933 
F-statistic: 3.377 on 3 and 8 DF,  p-value: 0.07493 

Notice `x2' appears twice!  Is there a better way to label the contrasts so
as to avoid this confusion?
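
The clash can be reproduced with simulated data, assuming (as the output
suggests) that x is a factor with levels "1", "2", "3" and x2 is a numeric
variable.  The second half shows one possible workaround, relabelling the
factor levels; the names xf and fit3 are made up here, and this is only a
sketch, not a fix to summary.lm itself:

```r
# Simulated data (an assumption, not the data behind the output above).
set.seed(2)
x  <- factor(rep(1:3, each = 4))   # levels "1", "2", "3"
x2 <- rnorm(12)                    # numeric covariate that happens to be named x2
y  <- rnorm(12)

# Treatment contrasts are labelled <variable><level>, so level "2" of x
# yields the label "x2", which collides with the covariate's name.
fit2 <- lm(y ~ x + x2)
names(coef(fit2))

# Possible workaround: use non-numeric level labels for the factor.
xf   <- factor(x, labels = c("a", "b", "c"))
fit3 <- lm(y ~ xf + x2)
names(coef(fit3))
```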

Andy

 
> However, if I call aov() with the same data
> (categorical and numeric) I don't see all these
> indicator variables in the ANOVA table. It is unclear
> to me how the ANOVA table with lots of indicator
> variables produced by lm() is transferred into the
> ANOVA table of aov().
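
A rough sketch of how the indicator columns relate to the rows of the ANOVA
table, on simulated data (the names g, z and y are made up for this
example).  Each column of the design matrix carries an index saying which
model term it belongs to, and the ANOVA table has one row per term, with as
many degrees of freedom as that term has columns:

```r
# Simulated data (an assumption): g is a 3-level factor, z is numeric.
set.seed(4)
d <- data.frame(g = factor(rep(c("a", "b", "c"), each = 4)),
                z = rnorm(12))
d$y <- rnorm(12)

fit <- lm(y ~ g + z, data = d)

# Design matrix: an intercept, two indicator columns for g, one column for z.
X <- model.matrix(fit)
colnames(X)

# "assign" maps each column to its term (0 = intercept, 1 = g, 2 = z).
term.index <- attr(X, "assign")
term.index

# The ANOVA table pools the two g columns into a single row with 2 Df.
tab <- anova(fit)
tab
```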
> 
> Also, after you mention the Error() term in aov() I
> tried to find some explanation about it in R manuals,
> and did not find any. Do you know where the meaning of
> Error() in aov() is documented ?
> 
> Thanks.
> 
> --- kjetil at entelnet.bo wrote:
> > On 15 Oct 2003 at 9:32, Alexander Sirotkin [at Yahoo] wrote:
> > 
> > > It is unclear to me how aov() handles non-categorical
> > > variables.
> > 
> > aov is an interface to lm, so it can estimate every model lm
> > can, the difference is that it produces the results (summary)
> > in the classical way for anova.
> > 
> > > 
> > > I mean it works and produces results that I would
> > > expect, but I was under impression that ANOVA is only
> > > defined for categorical variables.
> > > 
> > > In addition, help(aov) says that it "call to 'lm' for
> > > each  stratum", which  I presume means that it calls
> > > to lm() for every group of the categorical variable,
> > 
> > No. With anova you can also define "error strata"
> > using
> > Error() as part of the formula, lm() cannot do that.
> > If you don't use 
> > Error() in the formula, lm() is called only once. 
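
A small sketch of the distinction, on simulated data.  The design (six
subjects, each measured under three treatments) and the names subject,
treat and y are assumptions chosen for illustration:

```r
# Simulated repeated-measures style data (an assumption).
set.seed(3)
d <- expand.grid(subject = factor(1:6),
                 treat   = factor(c("a", "b", "c")))
# Add a per-subject random shift; subject varies fastest in expand.grid().
d$y <- rnorm(nrow(d)) + rep(rnorm(6), times = 3)

# Without Error(): a single lm() fit, and the result inherits from "lm".
fit.plain <- aov(y ~ treat, data = d)
class(fit.plain)

# With Error(): one fit per error stratum, returned as an "aovlist".
fit.strata <- aov(y ~ treat + Error(subject), data = d)
class(fit.strata)
names(fit.strata)     # the strata
summary(fit.strata)   # one ANOVA table per stratum
```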
> > 
> > Kjetil Halvorsen
> > 
> > > however I don't quite understand what this means for
> > > non-categorical variable.
> > > 
> > > Thanks
> > > 
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > 
> >
> 



