[R] bootstrap glm

Adams, Jean jvadams at usgs.gov
Mon Feb 29 23:01:12 CET 2016


​It's helpful if you post example data with your question, so R-help
readers can easily test your code​.  I created a fake data set with two x
variables for testing.  It's also helpful if you list the necessary
packages.  I assume that used the boot package.

You are dealing with two kinds of indices here, and I think you are
confusing them a bit in your code.  One index tells the program which x
columns to fit, another index tells the boot strapping part of the program
which rows to use.  I modified your code to do what I think you are trying
to do.  The results seem reasonable in this example.

Jean

# fake data
n <- 200
datasim <- data.frame(Y=sample(0:1, n, replace=TRUE), X1=rnorm(n),
X2=rnorm(n))

library(boot)

set.seed(111)

yfunction <- function(data, indices) {
  glm.snp1 <- glm(Y ~ ., family="binomial", data=data[indices, ])
  null <- glm.snp1$null.deviance
  residual <- glm.snp1$deviance
  return(null-residual)
}

resulty <- lapply(2:(ncol(datasim)), function(xcol) {
  boot(datasim[, c(1, xcol)], yfunction, R=100)
})

obsresult <- sapply(resulty, "[[", "t0")
bootresult <- sapply(resulty, "[[", "t")

# observed result
obsresult

# naive 95% bootstrap CI
apply(bootresult, 2, quantile, c(0.025, 0.975))


On Thu, Feb 25, 2016 at 9:59 AM, Hassan, Nazatulshima <
Nazatulshima.Hassan at liverpool.ac.uk> wrote:

> Hi
>
> I have a data with an outcome,Y and 10 predictors (X1-X10).
> My aim is to fit a logistic model to each of the predictors and calculate
> the deviance difference (dDeviance).
> And later on bootstrapping the dDeviance for 100 times (R=100).
> I tried the following function. It is calculating the original dDeviance
> correctly. But, when I checked the mean bootstrap values, it differs
> greatly from the original.
> I suspect I made a mistake with the bootstrapping function, which I need
> help with.
> I attached the script if you need to look at it.
>
> Thank you in advance.
>
>
> set.seed(111)
>
> yfunction <- function(data,indices)
> {
> glm.snp1 <- glm(Y~data[indices], family="binomial", data=datasim)
> null <- glm.snp1$null.deviance
> residual <- glm.snp1$deviance
> dDeviance <-(null-residual)
> return(dDeviance)
> }
>
> mybootstrap <- function(data)
> {
> boot(data,yfunction, R=100)
> }
>
> resulty <- lapply(datasim[,-1],function(x)mybootstrap(x))
> bootresult <- sapply(datasim[,-1],function(x)mybootstrap(x)$t)
> colMeans(bootresult)
>
>
>
> -shima-
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list