# [R] Bootstrapping Confidence Intervals for Medians

Tim Hesterberg timh at insightful.com
Wed Jan 31 04:11:31 CET 2007

```This is in followup to an earlier message about bootstrap confidence
intervals for the difference in two medians.  I'll touch on three
topics:
* bootstrap theory for one vs two samples
* special case of median
* bootstrap small samples

ONE VS TWO SAMPLES

Brian Ripley wrote (the complete posting is below):
>The 'standard' theory of bootstrap confidence intervals as implemented in
>e.g. package 'boot' is for a single-sample problem (and it would be
>pushing its justification very hard to use this for n=9).  But you have
>two samples, and haven't told us how you intend to bootstrap.  I guess you
>mean a stratified bootstrap, sampling with replacement independently from
>observations 1-4 and 5-9.  I don't know of theory for bootstrap confidence
>intervals from that scenario: do you?

Much of the bootstrap theory was developed for a single sample,
but carries over fairly easily to two samples or other stratified
sampling.  Davison and Hinkley do talk about stratified sampling
in their book, and Canty implemented that in the 'boot' package.
S+Resample also supports stratified sampling, with a front-end
function 'bootstrap2' to make the two-sample case easier.

SPECIAL CASE OF MEDIANS

>Beyond this, there are considerable problems with bootstrapping medians in
>small samples as the median is a non-smooth function of the data and the
>bootstrap samples take very few values.  See for example the galaxies
>dataset as discussed in MASS4.  For the stratified bootstrapping I
>referred to, there are only a handful of possible values of each of the
>medians and so the bootstrap distribution is a highly non-uniform one on a
>few values.  E.g.

Bootstrapping medians does present special difficulties.  Particularly
with a single sample and n odd, the bootstrap distribution is often a
very inaccurate estimate for the sampling distribution; it takes on
only a few values.

Curiously, however, bootstrap percentile intervals for a single median
are not bad.  They are similar to classical nonparametric intervals
based on order statistics, and use either the same or an adjacent
order statistic; where they differ, the classical interval is
conservative (one-sided non-coverage less than 0.025, often much
less), and the bootstrap interval is one order statistic narrower, with
closer to the nominal coverage.

BOOTSTRAPPING SMALL VS MEDIUM SAMPLES

People tend to think of bootstrapping for small samples, where they
don't trust the central limit theorem, like n=9.  That is misguided.
They should use the bootstrap in medium size samples, where
they shouldn't trust the central limit theorem, like perhaps n=20
(depending on the application) to n=1000.

For very small samples the data distribution does not reliably reflect
the shape of the population, and it is better to impose parametric
assumptions.  In other words, with very small samples you can't accurately
estimate skewness, so it may be better to use classical methods that just
assume that skewness is zero.

Conversely, for medium size samples one should not just assume that
skewness is zero, but instead use bootstrap or other methods that
allow for skewness.  Yes, n=1000 is a medium size sample -- the effect
of skewness on confidence intervals and tests decreases very slowly,
with size or coverage errors of O(1/sqrt(n)) for classical t tests and
confidence intervals.

One beauty of the bootstrap is that it provides a nice way to estimate
what size errors you make by assuming sampling distributions are normal --
by creating a bootstrap distribution and seeing how non-normal it is.
It provides a nice alternative to the pre-computer rule of using
normal methods if n >= 30.

Tim Hesterberg

========================================================
| Tim Hesterberg       Senior Research Scientist       |
| timh at insightful.com  Insightful Corp.                |
| (206)802-2319        1700 Westlake Ave. N, Suite 500 |
| (206)283-8691 (fax)  Seattle, WA 98109-3044, U.S.A.  |
|                      www.insightful.com/Hesterberg   |
========================================================
Bootstrap short courses:  May 21-22 Philadelphia, Oct 10-11 San Francisco.
2-3 May UK, 3-4 Oct UK.
(I do talk more about these issues in the short courses.)

Brian Ripley wrote:
>On Sat, 6 Jan 2007, gilbertg at musc.edu wrote:
>
>> I apologize for this post. I am new to R (two days) and I have tried and
>> tried to calculated confidence intervals for medians. Can someone help
>> me?
>
>Later, you say you want a confidence interval for a difference in medians,
>not the same thing.
>
>For medians, see MASS4 section 5.7 for worked examples and discussion of
>the pitfalls.
>
>> Here is my data:
>>
>> institution1
>> 0.21
>> 0.16
>> 0.32
>> 0.69
>> 1.15
>> 0.9
>> 0.87
>> 0.87
>> 0.73
>>
>> The first four observations compose group 1 and observations 5 through 9
>> compose group 2. I would like to create a bootstrapped 90% confidence
>> interval on the difference of the medians (n2-n1). I have successfully
>> calculated a permutation test.
>>
>> This shouldn't be as difficult as I am making it, would someone please
>> enlighten me?
>
>
>It seems to me to be much more difficult than you have made it.  We need
>to know exactly what you mean by
>
>> a bootstrapped 90% confidence interval on the difference of the medians
>
>The 'standard' theory of bootstrap confidence intervals as implemented in
>e.g. package 'boot' is for a single-sample problem (and it would be
>pushing its justification very hard to use this for n=9).  But you have
>two samples, and haven't told us how you intend to bootstrap.  I guess you
>mean a stratified bootstrap, sampling with replacement independently from
>observations 1-4 and 5-9.  I don't know of theory for bootstrap confidence
>intervals from that scenario: do you?
>
>Beyond this, there are considerable problems with bootstrapping medians in
>small samples as the median is a non-smooth function of the data and the
>bootstrap samples take very few values.  See for example the galaxies
>dataset as discussed in MASS4.  For the stratified bootstrapping I
>referred to, there are only a handful of possible values of each of the
>medians and so the bootstrap distribution is a highly non-uniform one on a
>few values.  E.g.
>
>x <- c(0.21, 0.16, 0.32, 0.69)
>y <- c(1.15, 0.9, 0.87, 0.87, 0.73)
>z <- numeric(10000)
>for(i in seq_len(10000))
>z[i] <- median(sample(x, replace=TRUE)) - median(sample(y, replace=TRUE))
>
>  -0.99 -0.965  -0.94  -0.91 -0.885  -0.83  -0.74 -0.725 -0.715  -0.71
>     33     70     83     27    134     64    129     16    259    317
>   -0.7  -0.69 -0.685  -0.66 -0.645 -0.635  -0.63 -0.605  -0.58  -0.57
>     43    370    711   1064     70    538    455   1388    424     29
>  -0.55 -0.545  -0.52  -0.49 -0.475 -0.465  -0.46  -0.45 -0.445  -0.42
>    905     57     79     41     54    119     28    183    146    436
>  -0.41 -0.395 -0.365 -0.305  -0.28 -0.225  -0.21  -0.18  -0.04
>    100    290    759     13     34     64    117    323     28
>
>You could use that table to give 'basic' or 'percentile' confidence
>intervals, if you have reason to believe in them.
>
>
>> Greg Gilbert, Faculty Research Associate
>> Department of Biostatistics, Bioinformatics, & Epidemiology
>> Medical University of South Carolina
>
>--
>Brian D. Ripley,                  ripley at stats.ox.ac.uk
>Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>University of Oxford,             Tel:  +44 1865 272861 (self)
>1 South Parks Road,                     +44 1865 272866 (PA)
>Oxford OX1 3TG, UK                Fax:  +44 1865 272595

```