[R] bootstrap data from groups

Ulrich Keller uhkeller at web.de
Wed Jun 7 20:55:11 CEST 2006


I am not sure I understand what you want to do, but maybe some of this 
will be helpful. I first generate some data that should resemble yours:

dat<-expand.grid(Region=1:3, Species=1:4, Sex=c("M","F"))
dat<-do.call("rbind",lapply(1:10,function(x) dat))
dat$Bodysize<-rnorm(nrow(dat),10,2)
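
As a quick sanity check (my own addition, not something the question 
asked for), the sketch below regenerates the data and confirms that 
every Region x Species x Sex cell holds 10 individuals. The set.seed() 
call and its value are arbitrary and only there to make the example 
reproducible; cell.n is a name I made up for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- expand.grid(Region=1:3, Species=1:4, Sex=c("M","F"))
dat <- do.call("rbind", lapply(1:10, function(x) dat))
dat$Bodysize <- rnorm(nrow(dat), 10, 2)

# Number of individuals in each Region x Species x Sex cell
cell.n <- table(dat$Region, dat$Species, dat$Sex)
dim(cell.n)        # regions, species, sexes
all(cell.n == 10)  # TRUE if every cell has 10 individuals
```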

The following piece of code samples 4 of the 10 individuals, with 
replacement, in each of the 24 subsets (region*species*sex) and 
creates a new data frame with 96 cases. It then computes the mean of 
Bodysize in each of the subsets. The whole thing is done 100 times, and 
the results are collected in a matrix with one row per subset and one 
column per resample. We end up with 100 bootstrapped means for each of 
the 24 subsets.

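# 24 x 100 matrix: one row per subset, one column per resample
groupmeans<-sapply(1:100, function(z) {
  # draw 4 rows (with replacement) from each of the 24 subsets
  dat.rs<-do.call("rbind",
    lapply(split(dat,list(dat$Region,dat$Species,dat$Sex)),
      function(x) x[sample(10, 4, replace=TRUE),]))
  # mean Bodysize per subset in the resampled data
  aggregate(dat.rs$Bodysize,
    list(dat.rs$Region,dat.rs$Species,dat.rs$Sex),
    mean)$x
  }
)
# label the rows with the Region/Species/Sex keys, e.g. "11M"
tmp<-aggregate(dat$Bodysize,
  list(dat$Region,dat$Species,dat$Sex),mean)
rownames(groupmeans)<-apply(tmp[,1:3],1,paste,collapse="")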
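
The code above hardcodes the group size (sample(10, ...)), which is 
fine for the simulated data. If your real groups differ in size, a 
variant along the following lines might be safer. This is only a 
sketch under my own assumptions: boot.mean, grp and groupmeans2 are 
hypothetical names, the seed is arbitrary, and I use tapply() on the 
Bodysize vector instead of resampling whole rows.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- expand.grid(Region=1:3, Species=1:4, Sex=c("M","F"))
dat <- do.call("rbind", lapply(1:10, function(x) dat))
dat$Bodysize <- rnorm(nrow(dat), 10, 2)

# One label per Region/Species/Sex combination, e.g. "1.1.M"
grp <- interaction(dat$Region, dat$Species, dat$Sex)

# Hypothetical helper: mean of k values drawn with replacement from x.
# Sampling indices (not x itself) works for any group size.
boot.mean <- function(x, k=4) mean(x[sample(length(x), k, replace=TRUE)])

# Rows are groups, columns are the 100 resamples
groupmeans2 <- sapply(1:100, function(z) tapply(dat$Bodysize, grp, boot.mean))
dim(groupmeans2)
```

The row names of groupmeans2 are the levels of grp, so no separate 
labelling step is needed.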

Now we can compute the mean and sd of the means by group:

 > apply(groupmeans,1,mean)
      11M       21M       31M       12M       22M       32M       13M       23M
 9.353095  9.267570  9.907933 10.992796  9.575841 10.412816  9.646964  9.433724
      33M       14M       24M       34M       11F       21F       31F       12F
10.750797  9.083630 10.573421  9.615743 10.267587 10.231126  9.329375 10.799071
      22F       32F       13F       23F       33F       14F       24F       34F
 9.355510 10.555705  9.919161 10.277103  9.335649  9.339544 10.023688  9.755115
 > apply(groupmeans,1,sd)
      11M       21M       31M       12M       22M       32M       13M       23M
0.7720758 1.5301540 1.0973516 0.8970237 1.0492995 0.9460970 0.5362957 1.1106675
      33M       14M       24M       34M       11F       21F       31F       12F
0.5333081 0.9259341 0.8198624 0.8061832 0.8466780 0.7052473 0.9857680 1.1057607
      22F       32F       13F       23F       33F       14F       24F       34F
0.8272433 1.2614559 1.2377154 1.0958545 0.9213648 0.9985215 1.1131870 1.0572494
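
If you would rather have the means and SDs in one table with the 
grouping variables as columns, something like the following sketch 
might do. It is self-contained, so it regenerates the data with an 
arbitrary seed and will not reproduce the numbers printed above. keys 
and boot.summary are names I made up, and the key order relies on 
tapply() varying the first grouping factor fastest, as expand.grid() 
does.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
dat <- expand.grid(Region=1:3, Species=1:4, Sex=c("M","F"))
dat <- do.call("rbind", lapply(1:10, function(x) dat))
dat$Bodysize <- rnorm(nrow(dat), 10, 2)

# Group keys in the order tapply() uses (first factor varies fastest)
keys <- expand.grid(Region=1:3, Species=1:4, Sex=c("M","F"))

# 24 x 100 matrix of bootstrapped group means
groupmeans <- sapply(1:100, function(z)
  tapply(dat$Bodysize, list(dat$Region, dat$Species, dat$Sex),
         function(x) mean(x[sample(length(x), 4, replace=TRUE)])))

# One row per group: keys plus mean and SD of the 100 bootstrapped means
boot.summary <- cbind(keys,
                      Mean=apply(groupmeans, 1, mean),
                      SD=apply(groupmeans, 1, sd))
head(boot.summary)
```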

Milton Cezar wrote:
> Hi R-friends.
>    
>   I have a mammal's dataset looking like:
>    
>        Region   Species Sex  Bodysize
>          1           Sp1      M      10.2
>          1           Sp1      M      12.1
>          1           Sp1      M       9.1
>         ...
>    
>   I have three regions, four species and the body size of 10 individuals. I'd like to do a bootstrap resample (100 resamples) of 4 of 10 individuals for each Region, Species and Sex and compute the means and S.D. for the combinations Region-Species-Sex.
>    
>   How can I do that?
>    
>   Thanks a lot,
>    
>   Miltinho