[R] Bootstrap CIs for weighted means of paired differences

Thu Nov 20 20:19:25 CET 2014

On Nov 20, 2014, at 2:23 AM, i.petzev wrote:

> Hi David,
> 
> sorry, I was not clear.

Right. You never were clear about what you wanted, and your example was so statistically symmetric that it is still hard to see what is needed. The examples below show CIs that are arguably equivalent. I can be faulted for attempting to provide code that produced a sensible answer to a vague question whose intent I could only guess at.

> The difference comes from defining or not defining “w” in the boot() function. The results with your function and your approach are thus:
> 
> library(boot)
> set.seed(1111)
> x <- rnorm(50)
> y <- rnorm(50)
> weights <- runif(50)
> weights <- weights / sum(weights)
> dataset <- cbind(x,y,weights)
> 
> vw_m_diff <- function(dataset,w) {
>   differences <- dataset[w,1]-dataset[w,2]
>   weights <- dataset[w, "weights"]
>   return(weighted.mean(x=differences, w=weights))
> }
> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
> boot.ci(res_boot)
> 
> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> Based on 1000 bootstrap replicates
> 
> CALL : 
> boot.ci(boot.out = res_boot)
> 
> Intervals : 
> Level      Normal              Basic         
> 95%   (-0.5657,  0.4962 )   (-0.5713,  0.5062 )  
> 
> Level     Percentile            BCa          
> 95%   (-0.6527,  0.4249 )   (-0.5579,  0.5023 )  
> Calculations and Intervals on Original Scale
> 
> ********************************************************************************************************************
> 
> However, without defining “w” in the bootstrap function, i.e., running an ordinary and not a weighted bootstrap, the results are:
> 
> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000)
> boot.ci(res_boot)
> 
> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> Based on 1000 bootstrap replicates
> 
> CALL : 
> boot.ci(boot.out = res_boot)
> 
> Intervals : 
> Level      Normal              Basic         
> 95%   (-0.6265,  0.4966 )   (-0.6125,  0.5249 )  

I hope you are not saying that, because those CIs are different, there is some meaning in that difference. Bootstrap runs will always be "different" from each other unless you call set.seed(.) with the same seed before each run.
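
To make that point concrete, here is a minimal sketch that reuses your data and your statistic with the ordinary (unweighted) bootstrap. The seed value 42 and the names run_a, run_b and run_c are my own choices for illustration, not anything from your code. Two runs started from the same seed should give identical replicates, while a third run without reseeding should differ purely through resampling noise:

```r
library(boot)

# Same simulated data as in the quoted example
set.seed(1111)
x <- rnorm(50)
y <- rnorm(50)
weights <- runif(50)
weights <- weights / sum(weights)
dataset <- cbind(x, y, weights)

# Weighted mean of the paired differences for the resampled rows;
# 'w' holds the row indices supplied by boot()
vw_m_diff <- function(dataset, w) {
  differences <- dataset[w, 1] - dataset[w, 2]
  weights <- dataset[w, "weights"]
  weighted.mean(x = differences, w = weights)
}

# Two runs started from the same seed (42 is arbitrary)
set.seed(42)
run_a <- boot(dataset, statistic = vw_m_diff, R = 1000)
set.seed(42)
run_b <- boot(dataset, statistic = vw_m_diff, R = 1000)

# A third run that simply continues the random number stream
run_c <- boot(dataset, statistic = vw_m_diff, R = 1000)

identical(run_a$t, run_b$t)  # same seed, same replicates
identical(run_a$t, run_c$t)  # no reseeding, different replicates

# Percentile intervals for the reseeded and the non-reseeded run
boot.ci(run_a, type = "perc")
boot.ci(run_c, type = "perc")
```

Comparing the two printed intervals gives a rough idea of how much the limits move from run to run with R = 1000, which is the yardstick to use before reading anything into a difference between two sets of CIs.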

> 
> Level     Percentile            BCa          
> 95%   (-0.6714,  0.4661 )   (-0.6747,  0.4559 )  
> Calculations and Intervals on Original Scale
> 
> On 19 Nov 2014, at 17:49, David Winsemius <dwinsemius at comcast.net> wrote:
> 
>>>> vw_m_diff <- function(dataset,w) {
>>>>     differences <- dataset[w,1]-dataset[w,2]
>>>>    weights <- dataset[w, "weights"]
>>>>    return(weighted.mean(x=differences, w=weights))
>>>>  }
> 

David Winsemius
Alameda, CA, USA