[R] bootstrap sample for clustered data

Liu, Lei lei@liu @ending from wu@tl@edu
Sun Sep 16 19:39:41 CEST 2018


Hi there,

I am trying to generate bootstrap samples for clustered data. Here is some code I found on the web that does this:

id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
y  <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
x  <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)

xx <- data.frame(id, x, y)

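boot.cluster <- function(x, id){

  # resample whole clusters (ids) with replacement
  boot.id <- sample(unique(id), replace=TRUE)
  # collect all rows of each sampled cluster, in the order drawn
  out <- lapply(boot.id, function(i) x[id%in%i,])

  return( do.call("rbind",out) )

}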

boot.pro=boot.cluster(xx, xx$id)

This gives the following output:

   id x   y
5   3 0 0.4
6   3 0 1.0
51  3 0 0.4
61  3 0 1.0
9   5 1 0.5
10  5 1 2.0
52  3 0 0.4
62  3 0 1.0
3   2 1 0.4
4   2 1 0.3

However, the id column still holds the original ids, whereas for later analysis I want new ids (1, 1, 2, 2, 3, 3, 4, 4, 5, 5), one per sampled cluster. Note that the same original id can appear more than once, because the clusters are drawn with replacement, and each draw should count as a separate cluster. Can anyone show me how to do this? Thanks a lot!

Lei
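
For the relabeling asked about above, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the cluster column is literally named id and that each draw should become its own cluster, numbered in the order drawn. The names boot.cluster2 and orig.id are made up for illustration and do not come from the code quoted in the post.

```r
# Same toy data as in the post
id <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
y  <- c(.5, .6, .4, .3, .4, 1, .9, 1, .5, 2)
x  <- c(0, 0, 1, 1, 0, 0, 1, 1, 1, 1)
xx <- data.frame(id, x, y)

# Hypothetical variant of boot.cluster(): the k-th draw gets new id k
boot.cluster2 <- function(dat, id) {
  ids <- unique(id)
  # draw cluster labels with replacement; indexing via sample.int()
  # avoids the special behaviour of sample() on a single number
  boot.id <- ids[sample.int(length(ids), replace = TRUE)]
  out <- lapply(seq_along(boot.id), function(k) {
    d <- dat[id %in% boot.id[k], , drop = FALSE]
    d$orig.id <- d$id   # keep the original cluster label (assumed column name)
    d$id <- k           # relabel: each draw is treated as a new cluster
    d
  })
  res <- do.call("rbind", out)
  rownames(res) <- NULL  # drop the duplicated row names such as 51, 61
  res
}

set.seed(1)  # only to make the sketch reproducible
boot.pro <- boot.cluster2(xx, xx$id)
boot.pro
```

Because the relabeling happens per draw, a cluster that is sampled twice ends up under two different new ids, while orig.id still records where the rows came from.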

