[R] boot() versus loop, and statistics option

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Feb 6 14:09:40 CET 2011


Package boot is support software for a book (Davison and Hinkley, 1997,
Bootstrap Methods and Their Application, Cambridge University Press):
have you consulted it? It answers all your questions, and has copious
examples.
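
On question (1), a minimal sketch, assuming the default nonparametric
bootstrap (sim = "ordinary") is what the loop is meant to reproduce. The
statistic takes a second argument, the vector of resampled indices that
boot() supplies, and subsets the data with it. With sim = "parametric" and
no ran.gen supplied, the default ran.gen returns the data unchanged, so
every replicate equals the original, which is why bias and std. error are
reported as 0. The argument names data and i are illustrative.

```r
library(boot)

set.seed(1234)
x <- runif(300)

# The statistic receives the data and a vector of resampled indices;
# boot() generates a fresh index vector for each of the R replicates.
myfun <- function(data, i) {
  y <- data[i]
  c(mean(y), quantile(y, c(0.025, 0.975)))
}

b <- boot(x, myfun, R = 100)  # sim = "ordinary" is the default
b                             # original value, bias and std. error
colMeans(b$t)                 # comparable to apply(res, 2, mean) in the loop
```

The replicates are stored in b$t, one row per resample and one column per
statistic, so any further summary can be computed from that matrix.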

On Sun, 6 Feb 2011, Sascha Vieweg wrote:

> Hello R users
>
> I am quite new to bootstrapping. Now, having some data x,
> ----
> R: set.seed(1234)
> R: x <- runif(300)
> ----
> I want to bootstrap simple statistics, mean and quantiles (.025, .975). 
> Currently, I run a loop
> ----
> R: res <- as.data.frame(matrix(ncol = 3, dimnames = list(NULL,
> ...    c("M", "Lo", "Hi"))))
> R: for (i in 1:100) {
> ...    y <- x[sample(seq_along(x), length(x), replace = TRUE)]
> ...    res[i, ] <- c(mean(y), quantile(y, c(0.025, 0.975)))
> ...}
> ----
> and then apply mean()
> ----
> R: apply(res, 2, mean)
>         M         Lo         Hi
> 0.49377715 0.03089873 0.98120235
> ----
> to get the quantities of interest.
>
> I found the package 'boot' with the function of the same name. I tried to 
> replicate my tiny simulation using this code:
> ----
> R: library(boot)
> R: myfun <- function(x) {
> ...    return(c(mean(x), quantile(x, c(0.025, 0.975))))
> ...}
> R: boot(x, myfun, 100, sim = "parametric")
> PARAMETRIC BOOTSTRAP
>
> Call:
> boot(data = x, statistic = myfun, R = 100, sim = "parametric")
>
> Bootstrap Statistics :
>      original  bias    std. error
> t1* 0.48925194       0           0
> t2* 0.02806586       0           0
> t3* 0.98335435       0           0
> ----
> The outcome looks "quite" similar to what my loop returned, so that would be 
> fine. Yet, there are three things I don't understand:
>
> (1) I have to use the option 'sim="parametric"'. If I don't use this option 
> the function (provided via the statistic option) requires a second argument, 
> which -- according to '?boot' "will be a vector of indices, frequencies or 
> weights which define the bootstrap sample." What is that? Or is my simulation 
> simply parametric? Why?
>
> (2) What are the advantages and/or disadvantages of 'boot()' over my loop?
>
> (3) Can I in principle use 'boot()' to return all of the 100 different data 
> vectors used in the loop, or does 'boot()' by default return 
> already-calculated statistics?
>
> Thanks for hints and help, *S*
>
>
> -- 
> Sascha Vieweg, saschaview at gmail.com
>
>
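
On question (3), a hedged sketch: boot() returns the computed statistics
rather than the resampled data, but boot.array() can regenerate the
resampling indices from the seed stored in the boot object, and the data
vectors can be rebuilt from those. The names stat, idx and resamples are
illustrative, and a mean-only statistic is used to keep the check short.

```r
library(boot)

set.seed(1234)
x <- runif(300)

stat <- function(data, i) mean(data[i])
b <- boot(x, stat, R = 100)

# Regenerate the indices used for each replicate:
# one row per replicate, one column per observation drawn.
idx <- boot.array(b, indices = TRUE)
dim(idx)

# Rebuild the bootstrap data vectors, one per row.
resamples <- matrix(x[idx], nrow = nrow(idx))

# Recomputing the statistic from the rebuilt samples should reproduce b$t.
all.equal(rowMeans(resamples), as.vector(b$t))
```

This only works for the resampling schemes that boot.array() supports, and
it relies on the boot object not having been modified since it was created.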

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


