[R] boot with strata: strata argument ignored?
Charles C. Berry
cberry at tajo.ucsd.edu
Sat Jun 26 18:43:40 CEST 2010
On Sat, 26 Jun 2010, Bryan Hanson wrote:
> Hello All. I must be missing the really obvious here:
>
> mm <- function(d, i) median(d[i])
> b1 <- boot(gravity$g, mm, R = 1000)
> b1
> b2 <- boot(gravity$g, mm, R = 1000, strata = gravity$series)
> b2
>
> Both b1 and b2 seem to have done (almost) the same thing, but it looks like
> the strata argument in b2 has been ignored. However, str(b1) vs str(b2)
> does show that the strata have been noted correctly. But b2$t is a 1000 x 1
> array, not a 1000 x 8 array (gravity$series is a factor with 8 levels).
>
> There is a more complex example in ?boot using the same data set that gives
> a result that seems to make sense (2 levels in the factor, so $t has 2
> columns).
>
> I either misunderstand the expected behavior or I've missed some punctuation
> or syntax detail.
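Your punctuation and syntax are OK. The strata argument only changes how
the resamples are drawn (within each series); mm still returns one median
per resample, so b2$t stays 1000 x 1.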
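If the aim was one bootstrap column per series, as the 1000 x 8 expectation in the question suggests, the statistic has to return eight values itself. A minimal sketch, assuming the boot package and its gravity data are available; med_by_series is a name made up here for illustration, not something from the package:

```r
library(boot)  # provides boot() and the gravity data frame

# Hypothetical helper: one median per series, computed on the resampled rows.
med_by_series <- function(d, i) {
  d <- d[i, ]
  tapply(d$g, d$series, median)
}

set.seed(1)  # arbitrary seed, only so the run can be repeated
b3 <- boot(gravity, med_by_series, R = 1000, strata = gravity$series)
dim(b3$t)  # one row per resample, one column per series
```

With strata supplied, every resample contains each series at its original size, so none of the eight medians is ever missing. Without strata, a series can drop out of a resample and tapply() would return NA for it.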
Note:
> SISWR <- function(x) sample(x,length(x),replace=TRUE)
> # no strata
> var(replicate(1000,median(SISWR(gravity$g))))
[1] 0.4588338
> # now stratify on series
> gsplit <- split(gravity$g,gravity$series)
> var(replicate(1000,median(unlist(lapply(gsplit,SISWR)))))
[1] 0.3882272
>
> sqrt(.45) # this agrees with b1
[1] 0.6708204
> sqrt(.39) # this agrees with b2
[1] 0.6244998
>
The effect of stratification depends on the relative amount of variation
within vs. between strata. The ANOVA below suggests there is not a lot of
between-series variation relative to within-series variation:
> aov(g~series,gravity)
Call:
aov(formula = g ~ series, data = gravity)
Terms:
                  series Residuals
Sum of Squares  2818.624  8239.376
Deg. of Freedom        7        73
Residual standard error: 10.62394
Estimated effects may be unbalanced
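To put a number on "within vs between", the figures in the aov printout can be turned into a share of the total sum of squares and an F ratio. This is only arithmetic on the printed values; the variable names are made up for illustration:

```r
# Figures copied from the aov() printout above.
ss_series <- 2818.624; df_series <- 7
ss_resid  <- 8239.376; df_resid  <- 73

# Share of the total sum of squares that lies between series.
share_between <- ss_series / (ss_series + ss_resid)

# Mean squares and the usual one-way F ratio.
ms_series <- ss_series / df_series
ms_resid  <- ss_resid / df_resid
f_ratio   <- ms_series / ms_resid

# sqrt(ms_resid) should reproduce the printed residual standard error.
round(c(share_between = share_between, f_ratio = f_ratio,
        resid_se = sqrt(ms_resid)), 3)
```

Roughly a quarter of the total sum of squares is between series, which seems in line with the fairly small drop in bootstrap variance seen earlier.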
>
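For contrast, here is a sketch with simulated data (an assumption for illustration, not the gravity data) in which the strata means are far apart relative to the spread within each stratum. In that situation resampling within strata should shrink the variance of the bootstrap median much more visibly:

```r
# Resample with replacement, same size as the input (same idea as SISWR above).
SISWR <- function(x) sample(x, length(x), replace = TRUE)

set.seed(42)  # arbitrary seed, only so the run can be repeated

# Simulated data: three strata of 20 values each, means 0, 10 and 20, sd 1.
grp <- factor(rep(c("a", "b", "c"), each = 20))
y   <- rnorm(60, mean = c(0, 10, 20)[as.integer(grp)], sd = 1)

# Variance of the bootstrap median, ignoring strata.
v_plain <- var(replicate(2000, median(SISWR(y))))

# Variance of the bootstrap median, resampling within each stratum.
ysplit  <- split(y, grp)
v_strat <- var(replicate(2000, median(unlist(lapply(ysplit, SISWR)))))

c(plain = v_plain, stratified = v_strat)
```

Without strata, the number of low-stratum values in a resample varies, which shifts where the overall median falls inside the middle stratum. Stratifying fixes those counts, so that source of variation disappears.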
HTH,
Chuck
>
> TIA, Bryan
>
> *************
> Bryan Hanson
> Acting Chair
> Professor of Chemistry & Biochemistry
> DePauw University, Greencastle IN USA
>
>> sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] datasets tools grid graphics grDevices utils stats
> [8] methods base
>
> other attached packages:
> [1] boot_1.2-42 brew_1.0-3 faraway_1.0.4
> [4] GGally_0.2 xtable_1.5-6 mvbutils_2.5.1
> [7] ggplot2_0.8.7 digest_0.4.2 reshape_0.8.3
> [10] proto_0.3-8 ChemoSpec_1.43 R.utils_1.4.0
> [13] R.oo_1.7.2 R.methodsS3_1.2.0 rgl_0.91
> [16] lattice_0.18-5 mvoutlier_1.4 plyr_0.1.9
> [19] RColorBrewer_1.0-2 chemometrics_0.8 som_0.3-5
> [22] robustbase_0.5-0-1 rpart_3.1-46 pls_2.1-0
> [25] pcaPP_1.8-1 mvtnorm_0.9-9 nnet_7.3-1
> [28] mclust_3.4.4 MASS_7.3-5 lars_0.9-7
> [31] e1071_1.5-23 class_7.3-2
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901