[R] Distinct combinations for bootstrapping small sets

Marc Schwartz marc_schwartz at comcast.net
Tue Mar 6 17:59:51 CET 2007


On Tue, 2007-03-06 at 15:54 +0000, S Ellison wrote:
> Small data sets (6-12 values, or a similarly small number of groups)
> which don't look nice and symmetric are quite common in my field
> (analytical chemistry and biological variants thereof), and often
> contain outliers or at least stragglers that I cannot simply discard.
> One of the things I occasionally do when I want to see what different
> assumptions do to my confidence intervals is to run a quick
> nonparametric bootstrap, just to get a feel for how asymmetric the
> distribution of any estimates might be. At the moment, I'm also
> interested in doing that on some historical data to evaluate some
> proposed estimators for interlab studies.
> 
> boot() is pretty good, but it's obvious that with such small sets,
> there aren't really many distinct resampled combinations (e.g. 92378 for
> 10 data points). So I'm really resampling from quite a small
> population of possible bootstrap samples. It's surely more efficient to
> generate all the different (resampled) combinations of the data set,
> and use those and their frequencies to get things like the bootstrap
> variance exactly. At worst, that'll stop us fooling ourselves into
> thinking more replicates will get better info.
> 
> A lengthy dig around R-help and CRAN turned up a blank on generating
> distinct combinations with resampling, so I've written a couple of
> routines to generate the distinct combinations and their frequencies.
> (They work, though I wouldn't guarantee great efficiency). But if a
> chemist (me) can think of it, it's pretty certain that a statistician
> already has. Before I spend hours polishing code, is there already
> something out there I've missed?  
> 
> Steve Ellison

Steve,

The phrase that you seem to be looking for is "permutation test".

If you use the following in R:


  RSiteSearch("{permutation test}", restrict = "functions")


that will lead you to some of the functions available.  
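
To make the term concrete, here is a minimal base-R sketch of an exact
two-sample permutation test. The data and the names a, b, pooled, idx,
diffs, obs and p_exact are made up for illustration; this is not taken
from any particular package. It enumerates every way of assigning the
pooled values to the first group, which is only feasible for small sets:

```r
# Made-up two-group data, purely for illustration
a <- c(10.1, 9.8, 10.4, 12.9)
b <- c(9.5, 9.9, 9.7, 10.0)
pooled <- c(a, b)

# Each column of idx is one way to pick which pooled values form group a
idx <- combn(length(pooled), length(a))

# Difference in means for every possible relabelling
diffs <- apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))

# Observed difference, and the exact two-sided p-value
# (small tolerance guards against floating-point ties)
obs <- mean(a) - mean(b)
p_exact <- mean(abs(diffs) >= abs(obs) - 1e-12)
p_exact
```

With 4 values per group there are only choose(8, 4) = 70 relabellings,
so the whole permutation distribution is available rather than a random
sample from it.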

One CRAN package specifically, 'coin', has a permutation framework for a
variety of such tests.
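
On the enumeration you describe: as a rough sketch only, and not a
claim about how any existing package does it, the distinct resamples of
n values can be listed as count vectors (how many times each value is
drawn), with multinomial probabilities as their frequencies. The
function names n_distinct, compositions and exact_boot and the data in
x are hypothetical, and the sketch assumes the values in x are all
different:

```r
# Number of distinct resamples of size n from n distinct values
n_distinct <- function(n) choose(2 * n - 1, n)
n_distinct(10)  # 92378, the figure quoted above

# All vectors of `parts` non-negative integers summing to `total`,
# one per row, built recursively
compositions <- function(total, parts) {
  if (parts == 1) return(matrix(total, nrow = 1))
  do.call(rbind, lapply(0:total, function(k) {
    cbind(k, compositions(total - k, parts - 1))
  }))
}

# Exact bootstrap distribution of a statistic: its value on every
# distinct resample, with the probability of that resample
exact_boot <- function(x, stat = mean) {
  n <- length(x)
  counts <- compositions(n, n)
  prob <- apply(counts, 1, function(k)
    dmultinom(k, size = n, prob = rep(1 / n, n)))
  value <- apply(counts, 1, function(k) stat(rep(x, times = k)))
  list(counts = counts, value = value, prob = prob)
}

x <- c(1.2, 3.4, 2.2, 9.9, 2.8, 3.1)    # made-up data
eb <- exact_boot(x)
m <- sum(eb$prob * eb$value)            # exact bootstrap mean
v <- sum(eb$prob * (eb$value - m)^2)    # exact bootstrap variance
```

For the mean, v should agree with sum((x - mean(x))^2) / length(x)^2,
which is a handy check. The recursion is fine for 6 to 8 values but
gets slow well before n = 12.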

HTH,

Marc Schwartz


