# [R] Lack of independence in anova()

Phillip Good pigood at verizon.net
Mon Jul 4 17:28:45 CEST 2005

If the observations are normally distributed and the 2xk design is
balanced, theory requires that the tests for interaction and row effects be
independent. In my program, appended below, this would translate to cntT
being approximately equal to cntR*cntI/N if all R routines were functioning
correctly. They aren't.
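
To make the criterion concrete before the program itself, here is a minimal sketch of what cntT = cntR*cntI/N (approximately) means. It does not run any ANOVA: sigR and sigI are hypothetical stand-ins, drawn as two independent events that each occur with probability p, and the seed, N and p are arbitrary illustrative choices.

```r
# Illustrative only: two independent events, each occurring with probability p.
set.seed(1)               # arbitrary seed, for reproducibility
N <- 10000                # number of simulated trials
p <- 0.16                 # probability that each event occurs

sigR <- runif(N) <= p     # stand-in for "row test significant"
sigI <- runif(N) <= p     # stand-in for "interaction test significant"

cntR <- sum(sigR)         # trials where the first event occurred
cntI <- sum(sigI)         # trials where the second event occurred
cntT <- sum(sigR & sigI)  # trials where both occurred

# Under independence the joint count should be close to this product rule.
expected <- cntR * cntI / N
cat("observed cntT:", cntT, " expected under independence:", expected, "\n")
```

For truly independent events the two printed numbers should differ only by sampling noise; a gap much larger than that is what the program below is looking for.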

sim2=function(size,N,p){
  #counters: column, row, interaction, and joint (row and interaction)
  cntR=0
  cntC=0
  cntI=0
  cntT=0
  for(i in 1:N){
    #generate data
    v=gendata(size)
    #analyze after building the design containing the data
    lm.out=lm(yield~c*r,build(size,v))
    av.out=anova(lm.out)
    #av.out[[5]] is the Pr(>F) column; rows are c, r, c:r, Residuals
    #if column effect is significant, increment cntC
    if (av.out[[5]][1]<=p)cntC=cntC+1
    #if row effect is significant, increment cntR
    if (av.out[[5]][2]<=p){
      cntR=cntR+1
      tmp=1
    }
    else tmp=0
    if (av.out[[5]][3]<=p){
      #if interaction is significant, increment cntI
      cntI=cntI+1
      #if both interaction and row effect are significant, increment cntT
      cntT=cntT+tmp
    }
  }
  list(cntC=cntC, cntR=cntR, cntI=cntI, cntT=cntT)
}

build=function(size,v){
  #size is a vector containing the sample sizes
  col=c(rep(0,size[1]),rep(1,size[2]),rep(2,size[3]),rep(3,size[4]),
        rep(0,size[5]),rep(1,size[6]),rep(2,size[7]),rep(3,size[8]))
  row=c(rep(0,size[1]+size[2]+size[3]+size[4]),
        rep(1,size[5]+size[6]+size[7]+size[8]))
  return(data.frame(c=factor(col), r=factor(row), yield=v))
}

gendata=function(size){
  ssize=sum(size)
  return(rnorm(ssize))
}

#Example
size=c(3,3,3,0,3,3,3,0)
sim2(size,10000,.16)
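
As a quick check of the layout these example sizes produce, the following sketch tabulates the cell counts. It rebuilds the row and column labels in the same order as build() but with a vectorised rep(); the names col, row and cells are only for illustration.

```r
# Cell sizes in the order used by build(): row 0 cols 0..3, then row 1 cols 0..3
size <- c(3, 3, 3, 0, 3, 3, 3, 0)

# Repeat each column label by its cell size (a size of 0 drops that cell)
col <- rep(rep(0:3, times = 2), times = size)
# The first four cells belong to row 0, the last four to row 1
row <- rep(0:1, times = c(sum(size[1:4]), sum(size[5:8])))

# Cross-tabulate rows by columns to see the design
cells <- table(r = factor(row), c = factor(col))
print(cells)
```

Because the fourth column has size 0 in both rows, factor() drops that level and the design is 2x3 with equal cell sizes, i.e. balanced.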

Phillip Good
Huntington Beach CA