[R] Kruskal-Wallace power calculations.

Greg Snow 538280 at gmail.com
Fri Apr 3 19:43:47 CEST 2015


Here is some sample code:

## Simulation function to create data, analyze it using
## kruskal.test, and return the p-value
## change rexp to change the simulation distribution

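simfun <- function(means, k=length(means), n=rep(50,k)) {
  # one vector of simulated observations per group
  mydata <- lapply( seq_len(k), function(i) {
    # rexp(., 1) has mean 1, so subtracting 1 centers it at 0;
    # adding means[i] then gives group i the requested mean
    rexp(n[i], 1) - 1 + means[i]
  })
  # kruskal.test accepts a list and treats each element as a group
  kruskal.test(mydata)$p.value
}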

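# simulate under the null (all means equal) to check that the
# test has the nominal size
B <- 10000
out1 <- replicate(B, simfun(rep(3,4)))
hist(out1)                                  # should look roughly uniform
mean( out1 <= 0.05 )                        # estimated type I error rate
binom.test( sum(out1 <= 0.05), B, p=0.05)   # consistent with 0.05?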

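### Now simulate for power (unequal means)

B <- 10000
out2 <- replicate(B, simfun( c(3,3,3.2,3.3)))
hist(out2)
mean( out2 <= 0.05 )                        # estimated power at alpha = 0.05
# the confidence interval from binom.test shows the simulation
# uncertainty in the power estimate; the test against p=0.05 is
# not of interest here
binom.test( sum(out2 <= 0.05), B, p=0.05 )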

This simulates from a continuous, skewed distribution (the exponential)
and shifts each group to get the desired means.  A pure location shift
is a common assumption, though the test itself does not require it.
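
If it helps, the same idea can be extended to see how power changes
with the per-group sample size.  The following is only a rough sketch:
sim_pvalue() and power_curve() are made-up helper names (not from any
package), the exponential generator is just the default assumption
from the code above, and the sample sizes, seed and B are arbitrary
illustrative choices.  Swap in a generator that looks more like your
own data.

```r
## Sketch only: sim_pvalue() and power_curve() are hypothetical helpers.

## One simulated data set -> one Kruskal-Wallis p-value.
## rgen(m) must return m draws centered at 0; the default is the
## shifted exponential used above (an assumption, change as needed).
sim_pvalue <- function(means, n, rgen = function(m) rexp(m, 1) - 1) {
  k <- length(means)
  n <- rep_len(n, k)               # a single n, or one n per group
  groups <- lapply(seq_len(k), function(i) rgen(n[i]) + means[i])
  kruskal.test(groups)$p.value
}

## Estimated power for each per-group sample size in n_grid.
power_curve <- function(means, n_grid, B = 1000, alpha = 0.05,
                        rgen = function(m) rexp(m, 1) - 1) {
  power <- vapply(n_grid, function(n) {
    mean(replicate(B, sim_pvalue(means, n, rgen)) <= alpha)
  }, numeric(1))
  data.frame(n = n_grid, power = power,
             mc_se = sqrt(power * (1 - power) / B))  # simulation error
}

set.seed(538)                      # arbitrary seed, for reproducibility
res <- power_curve(c(3, 3, 3.2, 3.3), n_grid = c(25, 50, 100, 200),
                   B = 500)
print(res)
```

The mc_se column is the binomial standard error of each estimate, so
increase B if the curve looks too noisy to read.  To try a symmetric
distribution instead, pass something like rgen = function(m) rnorm(m).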

On Thu, Apr 2, 2015 at 8:19 PM, Collin Lynch <cflynch at ncsu.edu> wrote:
> Thank you Jim, I did see those (though not my typo :) and am still
> pondering the warning about post-hoc analyses.
>
> The situation that I am in is that I have a set of individuals who
> have been assigned a course grade.  We have then clustered these
> individuals into about 50 communities using standard community
> detection algorithms with the goal of determining whether community
> membership affects one of their grades.  We are using the KW test as
> the grade data is strongly non-normal and my coauthors preferred KW as
> an alternative.
>
> The two issues that I am struggling with are: 1) whether the post-hoc
> power analysis would be useful; and 2) how to code the simulation
> studies that are described in:
> http://onlinelibrary.wiley.com/doi/10.1002/bimj.4710380510/abstract
>
>
> Problem #1 is of course beyond the scope of this e-mail list though I
> would welcome anyone's suggestions on that point.  I am not sure that
> I buy the arguments against it offered here:
>
> http://graphpad.com/support/faq/why-it-is-not-helpful-to-compute-the-power-of-an-experiment-to-detect-the-difference-actually-observed-why-is-post-hoc-power-analysis-futile/
>
> It seems that the rationale boils down to "you didn't find it so you
> couldn't find it" but that does not tell me how far off I was from the
> goal.  I am still perusing the articles the author cites however.
>
>
> With respect to question #2 I am trying to lay my hands on the article
> and did find this old r-help discussion:
> http://r.789695.n4.nabble.com/Power-of-Kruskal-Wallis-Test-td4671188.html
> however I am not sure how to adapt the simulation studies that it
> links to to my current problem.  The links it leads to focus on
> mixed-effects models.  This may be more of a pure stats question and
> not suited for this list but I thought I'd ask in the hopes that
> anyone had any more specific KW code or knew of a good tutorial for
> the right kinds of simulation studies.
>
>     Thank you,
>     Collin.
>
>
>
>
> On Thu, Apr 2, 2015 at 6:35 PM, Jim Lemon <drjimlemon at gmail.com> wrote:
>> Hi Collin,
>> Have a look at this:
>>
>> http://stats.stackexchange.com/questions/70643/power-analysis-for-kruskal-wallis-or-mann-whitney-u-test-using-r
>>
>> Although, thinking about it, this might have constituted your "perusal of
>> the literature".
>>
>> Plus it always looks better when you spell the names properly
>>
>> Jim
>>
>>
>> On Fri, Apr 3, 2015 at 2:23 AM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>
>>> Please stop... you are acting like a broken record, and are also posting
>>> in HTML format. Please read the Posting Guide and demonstrate that you have
>>> used a search engine on this topic before posting again.
>>>
>>> ---------------------------------------------------------------------------
>>> Jeff Newmiller                        The     .....       .....  Go Live...
>>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>>                                       Live:   OO#.. Dead: OO#..  Playing
>>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>>>
>>> ---------------------------------------------------------------------------
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> On April 2, 2015 7:25:20 AM PDT, Collin Lynch <cflynch at ncsu.edu> wrote:
>>> >Greetings, I am working on a project where we are applying the
>>> >Kruskal-Wallace test to some factor data to evaluate their correlation
>>> >with existing grade data.  I know that the grade data is nonnormal
>>> >therefore we cannot rely on ANOVA or a similar parametric test.  What I
>>> >would like to find is a mechanism for making power calculations for the
>>> >KW test given the nonparametric assumptions.  My perusal of the
>>> >literature has suggested that a simulation would be the best method.
>>> >
>>> >Can anyone point me to good examples of such simulations for KW in R?
>>> >And does anyone have a favourite package for generating simulated data
>>> >or conducting such tests?
>>> >
>>> >    Thank you,
>>> >    Collin.
>>> >
>>> >       [[alternative HTML version deleted]]
>>> >
>>> >______________________________________________
>>> >R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> >https://stat.ethz.ch/mailman/listinfo/r-help
>>> >PLEASE do read the posting guide
>>> >http://www.R-project.org/posting-guide.html
>>> >and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>



-- 
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com


