[R] Strange results : bootstrap CIs

Rolf Turner
Sun Jan 14 03:33:49 CET 2024


On Sat, 13 Jan 2024 16:54:01 -0800
Bert Gunter <bgunter.4567 using gmail.com> wrote:

> Well, this would seem to work:
> 
> e <- data.frame(Score = Score
>              , Country = factor(Country)
>              , Time = Time)
> 
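> library(boot)  # provides boot() and boot.ci()
> ncountry <- nlevels(e$Country)
> # Return NA unless every Country level appears in the resample
> func <- function(dat, idx) {
>    if (length(unique(dat[idx, 'Country'])) < ncountry) NA
>    else coef(lm(Score ~ Time + Country, data = dat[idx, ]))
> }
> B <- boot(e, func, R = 1000)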
> 
> boot.ci(B, index=2, type="perc")
> 
> Caveats:
> 1) boot.ci handles the NA's by omitting them, which of course leaves
> fewer bootstrap replicates than the value of R specified in the call
> to boot(), and hence longer CI's.
> 
> 2) I do not know if the *nice* statistical properties of the
> nonparametric bootstrap, e.g. asymptotic correctness, hold when
> bootstrap samples are produced in this way.  I leave that to wiser
> heads than me.
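
Caveat 1 can be checked by counting how many replicates come back as NA.
The following is only a sketch on made-up toy data (the data frame `e`
below is invented for illustration and is not the data from this thread;
the names `n_dropped` and `n_kept` are likewise hypothetical):

```r
library(boot)

# Toy data, invented for illustration only
set.seed(42)
e <- data.frame(Score   = rnorm(9),
                Country = factor(rep(c("A", "B", "C"), each = 3)),
                Time    = rep(1:3, times = 3))
ncountry <- nlevels(e$Country)

# Statistic as in the quoted message: NA if any Country level is absent
func <- function(dat, idx) {
  if (length(unique(dat[idx, "Country"])) < ncountry) NA
  else coef(lm(Score ~ Time + Country, data = dat[idx, ]))
}

B <- boot(e, func, R = 200)

# Column 2 of B$t holds the replicates of the Time coefficient
n_dropped <- sum(is.na(B$t[, 2]))
n_kept    <- B$R - n_dropped
cat("replicates requested:", B$R, " usable:", n_kept, "\n")
```

The count of usable replicates, rather than R, is what a percentile
interval for the Time coefficient would effectively be based on.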

<SNIP>

It seems to me that my shaganappi idea causes func() to return a vector
of coefficients with NAs corresponding to any missing levels of the
"Country" factor, whereas your idea causes it to return a scalar NA
whenever one or more of the levels of the "Country" factor is missing.
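
A minimal sketch of that difference, on made-up toy data (not the data
from this thread) and with a hand-picked index vector standing in for one
bootstrap resample. It assumes that "Country" is a factor that keeps all
of its levels, which is what makes lm() report an NA coefficient for an
absent level:

```r
# Toy data, invented for illustration only
set.seed(1)
e <- data.frame(Score   = rnorm(12),
                Country = factor(rep(c("A", "B", "C"), each = 4)),
                Time    = rep(1:4, times = 3))
ncountry <- nlevels(e$Country)

# A hand-picked "resample" that contains no rows from country C
idx <- c(1, 2, 3, 4, 5, 6, 7, 8, 1, 5, 2, 6)

# Variant 1: fit regardless; the factor still has three levels, so the
# coefficient for the absent level comes back as NA
full_vec <- coef(lm(Score ~ Time + Country, data = e[idx, ]))

# Variant 2: return a single NA as soon as any level is absent
scalar_na <- if (length(unique(e[idx, "Country"])) < ncountry) NA else
  coef(lm(Score ~ Time + Country, data = e[idx, ]))

print(full_vec)   # four named entries; only CountryC is NA
print(scalar_na)  # a single NA
```

Under the first variant the Time coefficient from such a resample is
still available; under the second the whole replicate is presumably
discarded for every coefficient.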

I have no idea what the implications of this are.  As I said before, I
have no idea what I am doing!

cheers,

Rolf

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone:
         +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619


