[R] bootstrap resampling - simplified

Jonathan P Daily jdaily at usgs.gov
Wed Mar 2 19:32:18 CET 2011


I will point out again that sampling without replacement from a five-fold 
replicate of 1:20 is not the same as resampling with replacement, although 
I made an error in reporting the probabilities: under the replicate scheme 
P(x2 = 1 | x1 = 1) = 4/99, not 4/100, whereas P(x2 = 1 | x1 != 1) = 5/99. 
When sampling with replacement, P(x2 = 1 | x1 = 1) = P(x2 = 1 | x1 != 1) = 
1/20.
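
To make the arithmetic concrete, here is a minimal sketch that computes the 
second-draw probabilities directly. The variable names (pool, rest_after_1, 
and so on) are my own for illustration and do not come from any package.

```r
# Sketch: second-draw probabilities under the two schemes.
# All names here are illustrative assumptions, not from the thread.
pool <- rep(1:20, times = 5)            # five copies of each value, 100 items

# Without replacement: drop one copy of the first draw, inspect what is left.
rest_after_1 <- pool[-match(1, pool)]   # first draw was a 1: 99 items remain
p_same <- mean(rest_after_1 == 1)       # four 1s left, so 4/99

rest_after_2 <- pool[-match(2, pool)]   # first draw was not a 1 (here a 2)
p_other <- mean(rest_after_2 == 1)      # five 1s left, so 5/99

# With replacement from 1:20 the first draw is irrelevant.
p_repl <- 1 / 20

c(p_same = p_same, p_other = p_other, p_repl = p_repl)
```

The two conditional probabilities differ under the replicate scheme, so 
successive draws are dependent; with replacement they are equal.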
--------------------------------------
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
     - Jubal Early, Firefly

r-help-bounces at r-project.org wrote on 03/02/2011 01:05:01 PM:

> Re: [R] bootstrap resampling - simplified
> 
> Vokey, John 
> 
> to:
> 
> r-help
> 
> 03/02/2011 01:07 PM
> 
> Sent by:
> 
> r-help-bounces at r-project.org
> 
> On 2011-03-02, at 4:00 AM, r-help-request at r-project.org wrote:
> 
> > Hello there,
> > 
> > I have a problem concerning bootstrapping in R - especially
> > focusing on the resampling part of it. I try to sum it up in a
> > simplified way so that I would not confuse anybody.
> > 
> > I have a small database consisting of 20 observations (basically
> > numbers from 1 to 20, I mean: 1, 2, 3, 4, 5, ... 18, 19, 20).
> > 
> > I would like to resample this database many times for the
> > bootstrap process with the following conditions. Firstly, every
> > resampled database should also include 20 observations. Secondly,
> > numbers are selected from the above-mentioned 20 numbers with
> > replacement. The difficult part comes now: each number can be
> > selected at most 5 times. To make this clear, here are a couple of
> > examples. The resampled databases might look like the following:
> > 
> > (1st database)          1,2,1,2,1,2,1,2,1,2,3,3,3,3,3,4,4,4,4,4
> > 4 different numbers are chosen (1, 2, 3, 4), each selected the
> > maximum possible 5 times.
> > 
> > (2nd database)          1,8,8,6,8,8,8,2,3,4,5,6,6,6,6,7,19,1,1,1
> > Two numbers - 8 and 6 - selected 5 times (the maximum possible
> > times), number 1 selected 4 times, the others selected fewer than 4
> > times.
> > 
> > (3rd database)          1,1,2,2,3,3,4,4,9,9,9,10,10,13,10,9,3,9,2,1
> > Number 9 chosen the maximum possible 5 times, numbers 10, 3, 2 and
> > 1 chosen 3 times each, number 4 selected twice and number 13 selected
> > only once.
> > 
> > ...
> > 
> > Does anybody know how to implement my "tricky" condition in one of
> > the R functions - that one number can be selected only 5 times at
> > most? Are the 'boot' and 'bootstrap' packages capable of managing this?
> > I guess they are, I just couldn't figure it out yet...
> > 
> > Thanks very much! Best regards,
> > Laszlo Bodnar
> 
> Laszlo,
>   Create a vector consisting of 5 of each number.  Then, for each 
> sample, scramble the order of the items in the vector, and select 
> the first 20.
> 
> 
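
The recipe quoted just above can be sketched in a few lines of base R. This 
is only an illustration under my own assumptions: capped_resample is a 
hypothetical helper name, not a function from 'boot' or 'bootstrap', and 
set.seed() is there only so the example is repeatable.

```r
# Sketch of the recipe: five copies of each value, shuffle, keep the first 20.
# 'capped_resample' is a hypothetical helper, not part of any package.
capped_resample <- function(x, size = length(x), cap = 5) {
  pool <- rep(x, times = cap)     # each value appears exactly 'cap' times
  sample(pool)[seq_len(size)]     # random permutation, then the first 'size'
}

set.seed(1)                       # for repeatability only
draws <- replicate(1000, capped_resample(1:20))   # one resample per column

# Largest number of times any single value occurs within one resample.
max_count <- max(apply(draws, 2, function(s) max(table(s))))
max_count <= 5                    # the cap holds by construction
```

Shuffling the pool and keeping the first 20 items gives the same 
distribution as sample(pool, 20), that is, sampling without replacement 
from the 100-item pool.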
> --
> Please avoid sending me Word or PowerPoint attachments.
> See <http://www.gnu.org/philosophy/no-word-attachments.html>
> 
> -Dr. John R. Vokey
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


