[R] unbalanced design in multifactor anova....

akshay kulkarni akshay_e4 using hotmail.com
Sat Jan 22 12:56:14 CET 2022

Dear members,

Thanks Peter, Bert, Rolf and Terry. Apologies for replying this late.

If you say that balancedness is not required for lm, I think there is some inconsistency. The coded vectors can still be correlated, and unless lm handles that with some trick, aov and lm are essentially the same. I even came to know that aov calls lm internally! Moreover, can you please give me a reference that explains exactly how lm handles semipartial correlations (if it does at all)? I am just curious.
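
To make the aov/lm comparison concrete, here is a minimal sketch. The data frame `d`, its cell counts and the simulated response are made up for illustration and are not from this thread. It suggests that on unbalanced data the two functions give the same least-squares fit and the same sequential sums of squares, where each term is adjusted for the terms listed before it in the formula (as far as I understand, the same adjustment a semipartial correlation expresses).

```r
# Hypothetical unbalanced 2 x 2 layout: cell counts are 5, 2, 1 and 4.
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(7, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(5, 2, 1, 4)))
)
d$y <- as.integer(d$A) + 2 * as.integer(d$B) + rnorm(nrow(d))

fit_lm  <- lm(y ~ A + B, data = d)
fit_aov <- aov(y ~ A + B, data = d)

# An aov fit is also an lm fit, with identical coefficients.
inherits(fit_aov, "lm")
all.equal(coef(fit_lm), coef(fit_aov))

# The coded (dummy) columns are correlated because the design is unbalanced.
X <- model.matrix(fit_lm)
r_ab <- cor(X[, "Aa2"], X[, "Bb2"])
r_ab

# Both functions report the same sequential sums of squares.
ss_lm  <- anova(fit_lm)[["Sum Sq"]]
ss_aov <- summary(fit_aov)[[1]][["Sum Sq"]]
all.equal(ss_lm, ss_aov)
```

So the difference between the two is mostly in how the result is presented, not in how the correlated columns are fitted.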

Moreover, if I drop some observations to make my data balanced, will the results (from aov and lm) be reliable, i.e. will they support correct inferences about my sample?
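
For reference, a small sketch of what balance buys, again with made-up data (the data frame `bal` and the simulated response are assumptions for illustration). With equal cell counts the coded columns for A and B are uncorrelated, so the sum of squares for A is the same whichever factor is entered first. This only illustrates the balanced case; it does not say whether discarding observations is a good idea, since that throws away information.

```r
# Hypothetical balanced 2 x 2 layout with 3 observations per cell.
set.seed(3)
bal <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"), rep = 1:3,
                   stringsAsFactors = TRUE)
bal$y <- as.integer(bal$A) + 2 * as.integer(bal$B) + rnorm(nrow(bal))

# The dummy columns for A and B are uncorrelated in a balanced design.
X <- model.matrix(~ A + B, data = bal)
r_bal <- cor(X[, "Aa2"], X[, "Bb2"])
r_bal

# The sum of squares for A does not depend on the order of the terms.
ss_a_first <- anova(lm(y ~ A + B, data = bal))["A", "Sum Sq"]
ss_a_last  <- anova(lm(y ~ B + A, data = bal))["A", "Sum Sq"]
all.equal(ss_a_first, ss_a_last)
```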

Thanking you,
Akshay Kulkarni

From: peter dalgaard <pdalgd using gmail.com>
Sent: Tuesday, January 18, 2022 6:13 PM
To: akshay kulkarni <akshay_e4 using hotmail.com>
Cc: R help Mailing list <r-help using r-project.org>
Subject: Re: [R] unbalanced design in multifactor anova....

In brief, aov() requires balancedness (or at least you _really_ need to know what you are doing otherwise), lm() does not, but you need to be careful that the results, as in any multiple regression, depend on the order in which the terms are tested. For models with random effects, things get tricky and you likely need to use the "lme4" package.
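
The order dependence can be seen in a short sketch. The data are simulated and the names `d`, `ab` and `ba` are hypothetical, not from the original posts. With unequal cell counts, the sum of squares attributed to A changes depending on whether A is entered before or after B, while drop1() reports each term adjusted for all the others, which does not depend on the order in the formula.

```r
# Hypothetical unbalanced 2 x 2 layout: cell counts are 5, 2, 1 and 4.
set.seed(2)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(7, 5))),
  B = factor(rep(c("b1", "b2", "b1", "b2"), times = c(5, 2, 1, 4)))
)
d$y <- as.integer(d$A) + 2 * as.integer(d$B) + rnorm(nrow(d))

ab <- anova(lm(y ~ A + B, data = d))  # A entered first (not adjusted for B)
ba <- anova(lm(y ~ B + A, data = d))  # A entered after B (adjusted for B)

ss_a_first <- ab["A", "Sum Sq"]
ss_a_last  <- ba["A", "Sum Sq"]
c(A_first = ss_a_first, A_after_B = ss_a_last)

# Marginal tests: each term adjusted for every other term in the model.
marginal <- drop1(lm(y ~ A + B, data = d), test = "F")
marginal
```

The total and residual sums of squares are the same in both orderings; only the split between A and B moves.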

- Peter D.

> On 18 Jan 2022, at 08:14 , akshay kulkarni <akshay_e4 using hotmail.com> wrote:
> Dear members,
> I have a question on anova as implemented in R.
> If there is an unbalanced design in multifactor anova, will aov or lm work properly? I was reading a book on Excel where the author points out that in an unbalanced design, the factors, as coded vectors, are correlated. He says that variance will be allocated properly only when the coded vectors are uncorrelated. But he also argues that the function TREND() in Excel handles this automatically using semipartial correlations.
> What about aov or lm in R, which are used to implement anova? Should we do something extra for them to work properly in an unbalanced design? Or will the coding system used by R to represent the factors or levels internally handle the correlation?
> Thanking you,
> Yours sincerely,

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
