# [R] Lack of independence in anova()

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Thu Jul 7 12:18:09 CEST 2005

```
My first reaction to Duncan's example was "Touché -- with apologies
to Göran for suspecting an over-trivial example"! I had not thought
long enough about possible cases. Duncan is right; and maybe it is
the same example as Göran was thinking of.

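Regarding Spencer's argument below: in Duncan's statement,
"Z is supported on +/- A" (i.e. Z = A or Z = -A). With A = 1 in
Spencer's notation this gives P(|Z| < 1) = 0, so Spencer's 1-2z = 0
and z = 1/2, which is exactly the case in which Spencer's condition
for independence holds (but Spencer stipulates that Z is symmetric).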

In general, suppose P(Z = A) = p > 0 and P(Z = -A) = q = 1-p.

Since X and Y are symmetric, X/A has the same distribution
as X/(-A) and similarly for Y; hence for any v and w,
P(X/Z <= v | Z = z) does not depend on z = +/- A, and is therefore
= P(X/Z <= v); and similarly for Y.

Also X/A, Y/A are independent, and so are X/(-A) and Y/(-A).

Hence P(X/Z <= v and Y/Z <= w)

= p*P(X/Z <= v | Z = A)*P(Y/Z <= w | Z = A)

+ q*P(X/Z <= v | Z = -A)*P(Y/Z <= w | Z = -A)

= (p + q)*P(X/Z <= v)*P(Y/Z <= w)

= P(X/Z <= v)*P(Y/Z <= w)

so X/Z and Y/Z are independent.
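
# A minimal simulation sketch of the result above (an illustration, not
# part of the original argument). Assumptions made only for this check:
# X and Y are standard normal, A = 2, p = 0.3; the thresholds v, w and
# the sample size n are arbitrary choices.
set.seed(1)
n <- 200000
A <- 2; p <- 0.3
X <- rnorm(n)                        # symmetric about zero
Y <- rnorm(n)                        # symmetric about zero, independent of X
Z <- ifelse(runif(n) < p, A, -A)     # Z = A with probability p, else -A
v <- 0.2; w <- -0.1
joint   <- mean(X/Z <= v & Y/Z <= w) # estimate of P(X/Z <= v and Y/Z <= w)
product <- mean(X/Z <= v) * mean(Y/Z <= w)
c(joint = joint, product = product)  # should agree up to sampling error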

However, interesting though it may be, this is a side-issue
to the original question concerning independence of the F-ratios
in an ANOVA. Here, numerators and denominator are all positive,
so examples like the above are not relevant.

The original argument (that increasing Z diminishes both X/Z
and Y/Z simultaneously) applies; but it is also possible to
demonstrate analytically that P(X/Z <= v and Y/Z <= w) is
greater than P(X/Z <= v)*P(Y/Z <= w).
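
# A numerical sketch of this inequality (an illustration under assumed
# distributions, not a proof): take X and Y as chi-squared on 5 d.f.
# divided by 5, and Z as chi-squared on 20 d.f. divided by 20, so that
# X/Z and Y/Z each follow F(5,20). The thresholds v and w are
# hypothetical values chosen for the example.
v <- 1; w <- 1.5
# P(X/Z <= v and Y/Z <= w) = E[ P(X <= v*Z | Z) * P(Y <= w*Z | Z) ],
# integrated against the density of Z, which is 20*dchisq(20*z, 20).
integrand <- function(z) {
  pchisq(5*v*z, 5) * pchisq(5*w*z, 5) * 20 * dchisq(20*z, 20)
}
joint   <- integrate(integrand, 0, Inf)$value
product <- pf(v, 5, 20) * pf(w, 5, 20)
c(joint = joint, product = product, excess = joint - product)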

The original issue also was that, in R, there might be a bug
in anova(). However, one can, in R and independently of the
behaviour of anova(), demonstrate this positive correlation:

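C <- numeric(10000)            # one sample correlation per replicate
for (i in 1:10000) {
  X <- rchisq(1000, 5)/5       # first numerator mean square (5 d.f.)
  Y <- rchisq(1000, 5)/5       # second numerator mean square (5 d.f.)
  Z <- rchisq(1000, 20)/20     # shared denominator mean square (20 d.f.)
  C[i] <- cor(X/Z, Y/Z)        # correlation between the two F-ratios
}
hist(C)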

which shows that all 10000 correlations are positive.
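
# As a rough cross-check (a sketch, not part of the original post), the
# population correlation for this setup can be worked out from moments.
# With X, Y = chisq_k/k and Z = chisq_m/m, all independent:
#   E[X] = E[Y] = 1,  E[X^2] = 1 + 2/k,
#   E[1/Z] = m/(m-2), E[1/Z^2] = m^2/((m-2)*(m-4)).
# The names k, m, E1, E2, covXY, varX, rho are introduced here only for
# illustration.
k <- 5; m <- 20
E1 <- m/(m - 2)                  # E[1/Z]
E2 <- m^2/((m - 2)*(m - 4))      # E[1/Z^2]
covXY <- E2 - E1^2               # Cov(X/Z, Y/Z) = Var(1/Z)
varX  <- (1 + 2/k)*E2 - E1^2     # Var(X/Z), the variance of F(k, m)
rho <- covXY/varX
rho                              # 5/23, about 0.217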

Best wishes to all,
Ted.

On 07-Jul-05 Spencer Graves wrote:
> Hi, Duncan & Göran:
>
>         Consider the following:  X, Y, Z have symmetric distributions
> with the following restrictions:
>
>         P(X=1)=P(X=-1)=x with P(|X|<1)=0 so P(|X|>1)=1-2x.
>         P(Y=1)=P(Y=-1)=y with P(|Y|<1)=0 so P(|Y|>1)=1-2y.
>         P(Z=1)=P(Z=-1)=z with P(|Z|>1)=0 so P(|Z|<1)=1-2z.
>
>         Then
>
>         P(X/Z=1)=2xz, P(Y/Z=1)=2yz, and
>         P{(X/Z=1)&(Y/Z=1)}=2xyz.
>
>         Independence requires that this last probability is 4xyz^2.
> This is true only if z=0.5.  If z<0.5, then X/Z and Y/Z are clearly
> dependent.
>
>         How's this?
>         spencer graves
>
> Duncan Murdoch wrote:
>
>> (Ted Harding) wrote:
>>
>>>On 06-Jul-05 Göran Broström wrote:
>>>
>>>
>>>>On Wed, Jul 06, 2005 at 10:06:45AM -0700, Thomas Lumley wrote:
>>>>(...)
>>>>
>>>>
>>>>>If X, Y, and Z are independent and Z takes on more than one
>>>>>value then X/Z and Y/Z can't be independent.
>>>>
>>>>Not really true. I  can produce a counterexample on request
>>>>
>>>>Göran Broström
>>>
>>>
>>>But true if both X  and Y have positive probability of being
>>>non-zero, n'est-pas?
>>>
>>>Tut, tut, Göran!
>>
>>
>> If X and Y are independent with symmetric distributions about zero,
>> and Z is supported on +/- A for some non-zero constant A, then X/Z
>> and Y/Z are still independent.  There are probably other special
>> cases too.
>>
>> Duncan Murdoch
>>
>> ______________________________________________
>> R-help at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> http://www.R-project.org/posting-guide.html
>
> --
> Spencer Graves, PhD
> Senior Development Engineer
> PDF Solutions, Inc.
> 333 West San Carlos Street Suite 700
> San Jose, CA 95110, USA
>
> spencer.graves at pdf.com
> www.pdf.com <http://www.pdf.com>
> Tel:  408-938-4420
> Fax: 408-280-7915