[R] H0 and H1 probabilities in Cohen's Effect Size w for X2 test

Sat Mar 10 02:00:46 CET 2007

Dear all,

I was delighted to notice that Cohen's formulas for the effect size 
'w' and the associated power have been implemented in the 'pwr' 
package (thanks to Stéphane Champely and others).

There is one aspect, though, that perplexes me. I'm doing some 
last-minute post hoc analyses, meaning that my sample size (N=3404) 
has long been fixed, and I'm interested in assessing the effect size 
(ES) and power after the fact.

As far as I can deduce from the implementation of ES.w2 and from 
Cohen's (1992) own article, the probabilities p(H0) and p(H1) would 
simply be the expected and observed absolute frequencies divided by 
the sample size N: the 'true' (H1) probabilities are the observed 
proportions and the null (H0) probabilities are the expected ones. 
If this is correct, the effect size and the power can easily be 
calculated with the 'pwr' package. However, this entails that the 
noncentrality parameter lambda=N*w^2 is equal to the chi-squared 
statistic X^2.
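
The identity lambda = N*w^2 = X^2 can be checked directly in base R, 
without 'pwr'. The following is only a minimal sketch: it assumes the 
expected counts come from the usual independence model (row total * 
column total / N), and the variable names p0, p1, w and X2 are 
illustrative rather than taken from any package. The counts are those 
of the 'observed' table shown below.

```r
# Observed 2 x 4 table of counts (same values as 'observed' below)
observed <- matrix(c(119,  64,  36,   37,
                     594, 323, 776, 1455),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("X", "Y"), c("p", "h", "m", "a")))
N <- sum(observed)

# Expected counts under independence: row total * column total / N
expected <- outer(rowSums(observed), colSums(observed)) / N

# Cell probabilities under H1 (observed) and H0 (expected)
p1 <- observed / N
p0 <- expected / N

# Cohen's w: the H0 probabilities go in the denominator
w <- sqrt(sum((p1 - p0)^2 / p0))

# Pearson's X^2 computed from the same table
X2 <- sum((observed - expected)^2 / expected)

# N * w^2 and X^2 agree up to floating-point error
all.equal(N * w^2, X2)
```

Note that w is not symmetric in its two arguments: swapping p0 and p1 
puts the observed proportions in the denominator and yields a 
different value.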

> observed
    p   h   m    a
X 119  64  36   37
Y 594 323 776 1455

> expected
          p         h         m         a
X  53.62162  29.10458  61.06698  112.2068
Y 659.37838 357.89542 750.93302 1379.7932

> observed.p
           p          h          m          a
X 0.03495887 0.01880141 0.01057579 0.01086957
Y 0.17450059 0.09488837 0.22796710 0.42743831

> expected.p
           p           h          m          a
X 0.01575253 0.008550112 0.01793977 0.03296322
Y 0.19370693 0.105139664 0.22060312 0.40534465

> ES.w2(observed.p)
[1] 0.2406104

> ES.w1(expected.p,observed.p)
[1] 0.2406104

> pwr.chisq.test(w=ES.w1(expected.p,observed.p),N=3404,sig.level=.05,df=3)
      Chi squared power calculation

               w = 0.2406104
               N = 3404
              df = 3
       sig.level = 0.05
           power = 1

  NOTE: N is the number of observations

> lambda <- 3404*ES.w1(expected.p,observed.p)^2

> lambda
[1] 197.0691

> pchisq(qchisq(p=.05,df=3,lower.tail=FALSE),df=3,ncp=lambda,lower.tail=FALSE)
[1] 1
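
To see why the power is reported as 1 at N=3404, the same 
noncentral chi-squared calculation can be wrapped in a small function 
and evaluated at several sample sizes. This is a sketch only: 
chisq_power is a hypothetical helper name, not a function from 'pwr', 
and w is simply the rounded value reported by ES.w1() above.

```r
# Power of the chi-squared test for effect size w and sample size N,
# using the noncentral chi-squared distribution with ncp = N * w^2.
# 'chisq_power' is a hypothetical helper, not part of the 'pwr' package.
chisq_power <- function(w, N, df, sig.level = 0.05) {
  # Critical value of the central chi-squared distribution under H0
  crit <- qchisq(sig.level, df = df, lower.tail = FALSE)
  # Probability of exceeding it when the noncentrality is N * w^2
  pchisq(crit, df = df, ncp = N * w^2, lower.tail = FALSE)
}

w <- 0.2406104   # effect size reported by ES.w1() above (rounded)
Ns <- c(50, 100, 200, 400, 3404)
powers <- sapply(Ns, function(N) chisq_power(w, N, df = 3))
data.frame(N = Ns, power = powers)
```

With w held fixed, the noncentrality grows linearly in N, so the 
power increases with the sample size.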

Have I missed or misunderstood something here altogether? Should the 
H0 probabilities instead be estimated by, e.g., some sort of fitting? 
Any pointers, suggestions or assistance would be greatly appreciated.

 	-Antti Arppe
-- 
======================================================================
Antti Arppe - Master of Science (Engineering)
Researcher & doctoral student (Linguistics)
E-mail: antti.arppe at helsinki.fi
WWW: http://www.ling.helsinki.fi/~aarppe