[R] unbalanced design in multifactor anova....

Rolf Turner r@turner @end|ng |rom @uck|@nd@@c@nz
Fri Jan 28 22:16:12 CET 2022


On Sat, 22 Jan 2022 11:56:14 +0000
akshay kulkarni <akshay_e4 using hotmail.com> wrote:

> Dear members,
> Thanks Peter, Bert, Rolf and Terry. Apologies for
> replying this late.
> 
> If you say that balancedness is not required for lm, I think there is
> some inconsistency. The coded vectors can still be correlated and
> unless lm handles it with some trick, aov and lm are essentially the
> same. I even came to know that aov calls lm internally! Moreover, can
> you please give me a reference that explains how exactly lm handles
> semipartial correlations (if it does at all)?

It doesn't.  They don't need to be "handled".

> I am just curious.
> 
> Moreover, if I drop some elements to make my data balanced, will the
> results (from aov and lm) be reliable, i.e. will they make the correct
> inference about my sample?

Generally not a great idea.  Dropping data means that you are throwing
away information.  The conclusions from your analysis will of course, in
general, depend on what data you drop.

You are imagining/creating difficulties that do not actually exist.

(1) The predictors will indeed be correlated (non-orthogonal) if the
design is unbalanced.  So what?

(2) aov() produces output that relates to a sequence of hypothesis
tests.  If the design is unbalanced some of these tests are not
meaningful.  Actually some of them are not really meaningful (or at
least are confusing to interpret) even if the design *is* balanced.

(3) Decide what null hypothesis you wish to test, and what alternative
hypothesis you wish to test it against.  Fit the null (getting, say,
fit0) and the alternative model (getting, say, fit1), using lm().

Then do anova(fit0,fit1).  Simple as that.
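
To make points (1) to (3) concrete, here is a minimal sketch on made-up
data.  The data frame d, the factors A and B, the response y and the
simulated effect sizes are all hypothetical, chosen only to give unequal
cell counts; the particular hypothesis tested (a B effect, given A) is
just one possible choice, not a recommendation.

```r
# Hypothetical unbalanced two-factor data (all names and values are made up).
set.seed(42)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 6))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 4))))
)
d$y <- rnorm(nrow(d), mean = 10 + 2 * (d$A == "a2") + 1 * (d$B == "b2"))

# Unequal cell counts, so the design is unbalanced.
print(table(d$A, d$B))

# (1) The coded (dummy) columns for A and B are correlated.
X <- model.matrix(~ A + B, data = d)
print(cor(X[, "Aa2"], X[, "Bb2"]))

# (2) The sequential tests depend on the order of the terms.
print(anova(lm(y ~ A + B, data = d)))
print(anova(lm(y ~ B + A, data = d)))

# (3) State the hypothesis explicitly: is there a B effect, given A?
fit0 <- lm(y ~ A, data = d)      # null model
fit1 <- lm(y ~ A + B, data = d)  # alternative model
cmp <- anova(fit0, fit1)
print(cmp)
```

The F statistic from anova(fit0, fit1) agrees with the line for B in
anova(lm(y ~ A + B)), since B is the last term entered there.  The line
for A, by contrast, changes when the order of the terms is swapped.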

The hard part is of course deciding on your hypotheses.  This requires
that you think!  And that you actually understand what problem you
are trying to solve.

Analysis of variance can be a subtle concept.  Some insight into the
subtleties might be obtained by reading Bill Venables' paper:

    http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


