[R] boxplots nearly identical but still highly significant effect?

Mon Sep 20 15:09:51 CEST 2010

Jake Kami <jakejkami <at> gmail.com> writes:

> 
> dear list,
> 
> i am running a within-design ANOVA with 4 factors (4,4,2 and 2 levels each).
> the last one is a time factor comprising two different treatment timepoints.
> i fit a mixed-effects model using lme and apply the anova function to the
> outcome. according to this analysis, there are highly significant main
> effects of the first and the time factor. i then checked the boxplots for the
> two 4-level factors for each timepoint separately: there is a difference of
> barely 1 to 2 units; actually, the plots look pretty much alike. also, there
> is no notable interaction effect. i am really wondering how this high
> significance of the time factor can come up then because i can not see any
> huge difference between the timepoints for all of the remaining factors. i
> know this might be a very basic statistical question but assistance in every
> way will be appreciated.

  Hard to say too much without more details.  (It's not clear what
your random effects are; trying to fit random effects with fewer
than 5 or 6 levels is difficult, so unless your grouping/random
factor is another variable that you haven't told us about, it's
quite possible that lme is reporting a very small variance for
the random factor.  But that may be a bit tangential ...)

  A few possibilities --
(1) boxplots display descriptive, not inferential, statistics.  Especially
if your sample sizes are large, the differences between level means could
be large in terms of standard errors of the mean but small in terms
of population standard deviations.
(2) you do have an orthogonal design, and you do say you don't
see effects of interactions, but ... it's possible that some of
the 'non-significant' factors are explaining enough of the residual
variance that the difference attributable to the 'significant' factors
is larger than it appears from the marginal distributions.  One way
to check this would be to fit a model with only the 'non-significant' factors
and then examine the difference in the residuals between levels of
the 'significant' factors.
(3) are you using likelihood-ratio or F tests?  If the latter, and
if your sample sizes are small enough, the tests may be seriously
anticonservative (see Pinheiro and Bates 2000).
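
  To make points (1) and (2) concrete, here is a small simulated
sketch.  It is not based on the poster's data: the sample size, the
1.5-unit time effect, and the between-subject and noise standard
deviations are all made-up assumptions, and it is written in plain
Python (standard library only) rather than R so that it runs on its
own.  It compares the time effect measured in units of the marginal
standard deviation (roughly what side-by-side boxplots show) with the
same effect measured in units of the standard error of the
within-subject differences (roughly what a within-design test uses).

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# All values below are hypothetical, chosen only for illustration.
n_subjects = 100
time_effect = 1.5    # shift between the two timepoints, in response units
subject_sd = 10.0    # between-subject spread (dominates the boxplots)
noise_sd = 1.0       # within-subject measurement noise

# Each subject has a personal baseline and is measured at both timepoints.
baseline = [random.gauss(50.0, subject_sd) for _ in range(n_subjects)]
t1 = [b + random.gauss(0.0, noise_sd) for b in baseline]
t2 = [b + time_effect + random.gauss(0.0, noise_sd) for b in baseline]

# Descriptive view: difference in means relative to the marginal spread.
mean_diff = statistics.mean(t2) - statistics.mean(t1)
pooled_sd = math.sqrt((statistics.variance(t1) + statistics.variance(t2)) / 2)
effect_in_sd_units = mean_diff / pooled_sd

# Inferential view: the same difference relative to the standard error
# of the within-subject differences (a paired t statistic).
diffs = [after - before for before, after in zip(t1, t2)]
se_diff = statistics.stdev(diffs) / math.sqrt(n_subjects)
paired_t = statistics.mean(diffs) / se_diff

print(f"difference in means:        {mean_diff:.2f}")
print(f"in marginal SD units:       {effect_in_sd_units:.2f}")
print(f"paired t statistic (df={n_subjects - 1}): {paired_t:.1f}")
```

  Under these assumptions the two boxplots would overlap almost
completely, because the shift is a small fraction of the marginal
standard deviation, yet the paired statistic is large because the
between-subject variation cancels out of the differences.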

  But good for you for trying to make sense of your results rather
than just reporting them ...