[R] boot() with glm/gnm on a contingency table

Tim Hesterberg timhesterberg at gmail.com
Wed Sep 12 16:08:49 CEST 2012


One approach is to bootstrap the vector 1:n, where n is the number
of individuals, with a function that does:
f <- function(vectorOfIndices, theTable) {
  (1) create a new table with the same dimensions, but with the counts
  in the table based on vectorOfIndices.
  (2) Calculate the statistics of interest on the new table.
}

When f is called with 1:n, the table it creates should be the same
as the original table.  When called with a bootstrap sample of
values from 1:n, it should create a table corresponding to the
bootstrap sample.

Tim Hesterberg
http://www.timhesterberg.net
 (resampling, water bottle rockets, computers to Costa Rica, shower = 2650 light bulbs, ...)

NEW!  Mathematical Statistics with Resampling and R, Chihara & Hesterberg
http://www.amazon.com/Mathematical-Statistics-Resampling-Laura-Chihara/dp/1118029852/ref=sr_1_1?ie=UTF8

>Hi everyone!
>
>In a package I'm developing, I have created a custom function to get
>jackknife standard errors for the parameters of a gnm model (which is
>essentially the same as a glm model for this issue). I'd like to add
>support for bootstrap using package boot, but I couldn't find how to
>proceed.
>
>The problem is, my data is a table object. Thus, I don't have one
>individual per line: when the object is converted to a data frame, one
>row represents one cell, or one combination of factor levels. I cannot
>pass this to boot() as the "data" argument and use "indices" from my
>custom statistic() function, since I would drop cells, not individual
>observations.
>
>A very inefficient solution would be to create a data frame with one row
>per observation, by replicating each cell using its frequencies. That's
>really a brute force solution, though. ;-)
>
>The other way would be generate importance weights based on observed
>frequencies, and to multiply the original data by the weights at each
>iteration, but I'm not sure that's correct. Thoughts?
>
>
>Thanks for your help!




More information about the R-help mailing list