[R] re form data for aov()?

Kingsford Jones kingsfordjones at gmail.com
Sun Mar 29 19:45:40 CEST 2009


Hi Dan,

You could kill two birds with one stone (i.e. learn about both linear
models and R at the same time) by using one of the many R-focused
modeling references.  Quite a few can be found in the contributed
documentation section of CRAN.  Here's one:

http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf

An answer to your question depends on which hypotheses you're
interested in testing.  The simplest is to test the overall effect of
Method on Bacterial.Counts with:

> f.aov <- aov(Bacterial.Counts ~ Method, data=d)

> summary(f.aov)
            Df Sum Sq Mean Sq F value   Pr(>F)
Method       3  29882    9961  7.0636 0.001111 **
Residuals   28  39484    1410
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Then if you want to do pairwise comparisons with a default family-wise
error rate of 95% you could use:


> TukeyHSD(f.aov)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Bacterial.Counts ~ Method, data = d)

$Method
                                 diff        lwr       upr     p adj
Antibacterial.Soap-Alcohol.Spray 55.0   3.735849 106.26415 0.0319648
Soap-Alcohol.Spray               68.5  17.235849 119.76415 0.0055672
Water-Alcohol.Spray              79.5  28.235849 130.76415 0.0012122
Soap-Antibacterial.Soap          13.5 -37.764151  64.76415 0.8886944
Water-Antibacterial.Soap         24.5 -26.764151  75.76415 0.5675942
Water-Soap                       11.0 -40.264151  62.26415 0.9355196


Another option is to use the lm function.

> f.lm <- lm(Bacterial.Counts ~ Method, data=d)
> summary(f.lm)

Call:
lm(formula = Bacterial.Counts ~ Method, data = d)

Residuals:
   Min     1Q Median     3Q    Max
-72.50 -20.87  -1.00  18.13 101.00

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)                 37.50      13.28   2.825 0.008629 **
MethodAntibacterial.Soap    55.00      18.78   2.929 0.006686 **
MethodSoap                  68.50      18.78   3.648 0.001070 **
MethodWater                 79.50      18.78   4.234 0.000224 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 37.55 on 28 degrees of freedom
Multiple R-squared: 0.4308,     Adjusted R-squared: 0.3698
F-statistic: 7.064 on 3 and 28 DF,  p-value: 0.001111


With the default treatment contrasts, the results are tests for
differences of each method from Alcohol.Spray.  There are many other
options for setting up the design matrix to test hypotheses of
interest, and for making adjustments for multiple testing.

Hope that helps,

Kingsford Jones

On Sun, Mar 29, 2009 at 11:01 AM, Dan Kelley <kelley.dan at gmail.com> wrote:
>
> I'm trying to follow along in a text by Velleman and others, with the
> 'handwashing' example of anova.   I used read.table() to read the data, and
> now I have an object d (put below the dots here), with an entry Method that
> has possible values "Water", "Soap", etc.  What I can't figure out is how to
> take this d$Method and use it in an aov call.  I tried
>
> aov(Bacterial.Counts ~ Water + Soap + Antibacterial.Soap + Alcohol.Spray,
> data=d)
>
> but this fails.  Do I need to break d$Method up into columns for each
> category, with boolean entries?  Or is there a way to do this more cleanly?
>
> ... the data ...
>
> `d` <-
> structure(list(Bacterial.Counts = c(74L, 84L, 70L, 51L, 135L,
> 51L, 164L, 5L, 102L, 110L, 88L, 19L, 124L, 67L, 111L, 18L, 105L,
> 119L, 73L, 58L, 139L, 108L, 119L, 50L, 170L, 207L, 20L, 82L,
> 87L, 102L, 95L, 17L), Method = structure(c(4L, 3L, 2L, 1L, 4L,
> 3L, 2L, 1L, 4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L, 4L,
> 3L, 2L, 1L, 4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L), .Label = c("Alcohol.Spray",
> "Antibacterial.Soap", "Soap", "Water"), class = "factor")), .Names =
> c("Bacterial.Counts",
> "Method"), class = "data.frame", row.names = c(NA, -32L))
>
> --
> View this message in context: http://www.nabble.com/reform-data-for-aov%28%29--tp22769850p22769850.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




More information about the R-help mailing list