[R] Homogeneity of variance tests between more than 2 sample

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Sun Dec 19 22:46:09 CET 2004


On 19-Dec-04 Landini Massimiliano wrote:
> Dear all
> a couple of months ago I found threads regarding tests that
> verify the ANOVA assumption of homogeneity of variances.  Prof.
> Ripley advised LDA / QDA procedures; many books (and many
> proprietary programs) advise Hartley's F_max, Cochran's
> minimum/maximum variance ratio (only balanced experiments),
> Bartlett's K^2 test, Levene's test.
> 
> Morton B. Brown and  Alan B. Forsythe in a 1974 article wrote
> about "Robust tests for the equality of variances" (published in
> the Journal of the American Statistical Association, Vol. 69, pp.
> 364-367) "...the common F-ratio and Bartlett's test are very
> sensitive to the assumption that the underlying populations
> are from a Gaussian distribution. When the underlying distributions
> are nonnormal, these tests can have an actual size several times
> larger than their nominal level of significance...."
> 
> Peter Armitage in  Statistical Methods in Medical Research
> ( Blackwell Scientific Publication, 1971, page. 212)
> "...Bartlett's test maybe is less useful than it seems; motif
> are two: first F test is very sensitive to the nonnormality;
> second, in samples with few data, true variances must differ
> in considerable manner before there is a wise/reasonable probability
> to obtain results significant. In other word, even if M/C ratio
> is NOT significant, estimated  variances and true variances can
> differ in substantial manner. If eventually differences in true
> variances had weight in further analysis, is more clever admit
> differences, even if tests give a non significant result..."
> 
> So I'm asking the gurus what the best practice is, and which test
> they use or teach.

It's true that Bartlett's test tends to be a better test of
normality of distribution than of homogeneity of variance.

It's also true that with small numbers of data these tests
are not powerful (though in that case you cannot hope for
much anyway).

For non-normal data, there's something of a question as to
what is meant (or, perhaps more accurately, what is intended
to be meant) by homogeneity of variance, as a test preliminary
to an analysis of variance.

It is possible to consider distribution-free approaches to
this kind of question.

One of Tukey's sneakiest inventions was the application of
the Mann-Whitney test (usually seen as a test of identity
of distribution against location-shift types of alternative,
more accurately against alternatives like "P(X<u) > P(Y<u)")
to test similarity of dispersion.

The trick: given X[1] , ... , X[m] and Y[1] , ... , Y[n], pool them
and sort the result as Z[1] < Z[2] < ... < Z[N] where N = m + n.

Now take the Z's in the order

  Z[1] , Z[N] , Z[2] , Z[N-1] , Z[3] , Z[N-2] , ....

i.e. work inwards from the ends, alternately from each end.

Note, as you proceed, whether each Z is an X or a Y.
You thus get a sequence of Xs and Ys. Then sum the number
of pairs (X,Y) in this sequence where the X occurs earlier
than the Y.
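
To make the counting rule concrete, here is a minimal sketch in Python
(an illustration only, not code from this thread). The function names
and the toy data are made up, and it assumes there are no ties in the
pooled sample, as the strict inequalities above imply.

```python
def outside_in_labels(x, y):
    """Label sequence ('X' or 'Y') of the pooled sample, taken
    alternately from the low end and the high end, working inwards."""
    pooled = sorted([(v, "X") for v in x] + [(v, "Y") for v in y])
    lo, hi = 0, len(pooled) - 1
    labels = []
    take_low = True
    while lo <= hi:
        if take_low:
            labels.append(pooled[lo][1])
            lo += 1
        else:
            labels.append(pooled[hi][1])
            hi -= 1
        take_low = not take_low
    return labels


def count_x_before_y(labels):
    """Number of (X, Y) pairs in which the X occurs earlier than the Y."""
    count = 0
    xs_seen = 0
    for label in labels:
        if label == "X":
            xs_seen += 1
        else:
            # every X seen so far forms one pair with this Y
            count += xs_seen
    return count


# Hypothetical toy data: X is spread out, Y is concentrated near zero.
x = [-5, 4, 6]
y = [-1, 0, 1]
labels = outside_in_labels(x, y)   # order taken: -5, 6, -1, 4, 0, 1
u = count_x_before_y(labels)
print("".join(labels), u)
```

With these data the Xs come out early in the sequence, so the count is
close to its maximum possible value of m*n = 9.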

This sum, under the null hypothesis of identity of distribution,
has the Mann-Whitney distribution (just like its usual version),
and it is sensitive to differences of dispersion (e.g. if the
distribution of X is more dispersed than the distribution of Y,
then the Xs will be found earlier in the sequence since they
lie further out than the Ys and so will be counted in first
by the above method).
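
For small samples the null distribution can be enumerated directly:
under the null hypothesis every assignment of the X and Y labels to
the N outside-in positions is equally likely. The following is a
minimal sketch in Python, assuming no ties and a one-sided alternative
(X more dispersed than Y). The function names and the data are
hypothetical. It uses the usual rank-sum identity: the count of
(X,Y) pairs with the X first equals the sum of the outside-in ranks
of the Ys minus n(n+1)/2.

```python
from itertools import combinations


def outside_in_ranks(values):
    """Rank 1 for the smallest value, 2 for the largest,
    3 for the second smallest, 4 for the second largest, ..."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    lo, hi = 0, len(values) - 1
    for r in range(1, len(values) + 1):
        if r % 2 == 1:          # odd ranks come from the low end
            ranks[order[lo]] = r
            lo += 1
        else:                   # even ranks come from the high end
            ranks[order[hi]] = r
            hi -= 1
    return ranks


def dispersion_u(x, y):
    """Count of (X, Y) pairs with the X earlier in the outside-in order."""
    ranks = outside_in_ranks(list(x) + list(y))
    y_ranks = ranks[len(x):]
    n = len(y)
    return sum(y_ranks) - n * (n + 1) // 2


def exact_p_value(x, y):
    """One-sided exact p-value: P(U >= observed) when all placements
    of the n Y labels among the N positions are equally likely."""
    m, n = len(x), len(y)
    observed = dispersion_u(x, y)
    total = 0
    as_extreme = 0
    for y_positions in combinations(range(1, m + n + 1), n):
        u = sum(y_positions) - n * (n + 1) // 2
        total += 1
        if u >= observed:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical data: X clearly more dispersed than Y.
x = [-9, -7, -6, 5, 8, 10]
y = [-2, -1, 0.5, 1, 2, 3]
u_obs, p = exact_p_value(x, y)
print(u_obs, p)
```

Here every X lies outside every Y, so the count reaches its maximum
m*n and only one of the 924 possible labellings is as extreme. The
enumeration is only practical for small m and n.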

No doubt, just as there are distribution-free extensions of
procedures like Mann-Whitney to several samples ("nonparametric
ANOVA"), so such a procedure could be applied to test equality
of "dispersions" for several samples, and no doubt it has been
done.

However, I've not made use of such a procedure myself, so I
have to leave it to others to report details.

Best wishes,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 19-Dec-04                                       Time: 21:46:09
------------------------------ XFMail ------------------------------



