[R] bootstrap sample for clustered data

Sun Sep 16 21:20:45 CEST 2018

I can't make sense of your post. Id 3 occurs 6 times, and ids 2 and 5 occur
twice each in your example. How do you get (1,1,2,2,3,3,4,4,5,5) out of
that? In other words, please specify the mapping of old ids to new ones.

Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sun, Sep 16, 2018 at 11:51 AM Liu, Lei <lei.liu using wustl.edu> wrote:

> Hi there,
>
> I tried to generate bootstrap samples for clustered data. Here is some
> code I found on the web to do the work:
>
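> id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
> y  <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
> x  <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
>
> xx <- data.frame(id, x, y)
>
> # Resample whole clusters (all rows sharing an id) with replacement.
> boot.cluster <- function(x, id) {
>
>   # draw as many cluster ids as there are distinct clusters
>   boot.id <- sample(unique(id), replace = TRUE)
>   # collect the rows of each drawn cluster, once per draw
>   out <- lapply(boot.id, function(i) x[id %in% i, ])
>
>   return(do.call("rbind", out))
>
> }
>
> boot.pro <- boot.cluster(xx, xx$id)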
>
> Now I have the output
>
>    id x   y
> 5   3 0 0.4
> 6   3 0 1.0
> 51  3 0 0.4
> 61  3 0 1.0
> 9   5 1 0.5
> 10  5 1 2.0
> 52  3 0 0.4
> 62  3 0 1.0
> 3   2 1 0.4
> 4   2 1 0.3
>
> However, the id column still holds the original ids, whereas I want the new
> ids to be (1, 1, 2, 2, 3, 3, 4, 4, 5, 5) for later analysis. Can anyone show
> me how to do this? Note that the same original id may appear more than once,
> since the bootstrap sample is drawn with replacement. Thanks a lot!
>
> Lei
>
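
One possible reading of the question is that each drawn cluster should simply
be numbered in the order it is drawn, so that a cluster drawn twice counts as
two separate clusters. Under that assumption, a minimal sketch could look like
the following. The column name new.id is introduced here for illustration and
is not part of the original code, and sample.int() is swapped in for sample()
on the id values.

```r
# Draw whole clusters with replacement and label the k-th draw as cluster k.
# Assumption: the new id is just the draw order; `new.id` is a made-up name.
boot.cluster <- function(x, id) {
  ids <- unique(id)
  # sample positions, not values: sample(ids) would sample from 1:ids
  # if there happened to be a single numeric id
  boot.id <- ids[sample.int(length(ids), replace = TRUE)]
  out <- lapply(seq_along(boot.id), function(k) {
    rows <- x[id == boot.id[k], , drop = FALSE]
    rows$new.id <- k  # repeated draws of one cluster get different new ids
    rows
  })
  res <- do.call(rbind, out)
  rownames(res) <- NULL  # drop the 51, 61, ... row names produced by rbind
  res
}

id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
y  <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
x  <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
xx <- data.frame(id, x, y)

set.seed(1)  # only to make the draw reproducible
boot.pro <- boot.cluster(xx, xx$id)
boot.pro$new.id
```

Because every cluster in this example has two rows, new.id comes out as
1 1 2 2 3 3 4 4 5 5 whatever the seed, while the id column keeps the original
ids so the mapping from old to new stays visible.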