[R] Bootstrap analysis from a conditional logistic regression

Mon Nov 13 23:55:11 CET 2017

> On Nov 13, 2017, at 2:01 PM, Nelly Reduan <nell.redu at hotmail.fr> wrote:
> 
> Nelly Reduan has shared a OneDrive file with you. To view it, click the link below.
> 
> Screenshot 2017-11-12 18.49.43.png <https://1drv.ms/u/s!Apkg2VlgfYyDgRAeVIM0nEajx0Fb>
> 
> Hello
> 
> How can I perform a bootstrap analysis from a conditional logistic regression? The model has been built using the `clogit` function (`survival` package). The model has the following structure:
> 
>    mod <- clogit(event ~ forest + log_area +forest:log_time  + cluster(ID_individual)  +   strata(ID_strata), method = "efron", data = data , x=T, y=T)
> 
> Using bootstrapping, I would like to have a measure of uncertainty around the estimates of beta coefficients.
> 
> I am using the following code but I don't know how to consider strata and cluster arguments.
> 
>    library(boot)
>    boot.clogit <- function(data, indices){
>      new_data <- data[indices,]
>      mod <- clogit(event ~ forest + log_area + forest:log_time  + cluster(ID_individual)  +  strata(ID_strata),
>                    method = "efron", data = new_data, x=T, y=T)
>      coefficients(mod)
>    }
> 
>    boot_data <- boot(data=data, statistic=boot.clogit, R=5000)
> 
> I have attached an overview of my data set.

You probably tried to attach something but you failed to note the section in the listinfo or posting guide where the list owners describe the rules for attachments. I think you would need to describe the sampling design more thoroughly. A simple description of the data layout may not be sufficient.

The fact that you are clustering on individuals suggests you have some sort of repeated-measures design and that you have somehow matched each individual to controls in some unstated ratio (handled by the strata). (Admittedly this is all guesswork, and the more knowledgeable respondents (among whom I'm not likely to reside) are often hesitant to contribute substantive commentary unless they can narrow down the range of possible design issues.) I read Davison and Hinkley as suggesting that sampling by group, but then keeping the sampled groups undisturbed, may have a better chance of resulting in variance estimates that match the superpopulation. See pages 100-102 of their book.

If my reading of that section is correct, then I should think you would arrange your data so that groups are in the long direction and each group occupies a single line of data with a single index. Then you would probably rearrange the data within the boot.clogit function so that the "inner" clogit call can handle it correctly.
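
A minimal sketch of that idea follows, under assumptions that go beyond what was posted: that ID_individual is the group to resample, that every stratum belongs to exactly one individual, and that the real data look roughly like the simulated `toy` data frame below (its sizes, effect values and the names `toy`, `by_id`, `boot_clogit`, `boot_id` and `boot_strata` are all made up for illustration). Individuals are drawn with replacement and each drawn individual keeps all of its strata intact. Strata are relabelled per draw so that an individual drawn twice contributes two separate sets of strata rather than merged ones. The cluster() term is dropped inside the bootstrap fit because it only changes the robust variance, not the coefficients.

```r
library(survival)
library(boot)

set.seed(1)

# Hypothetical toy data: 30 individuals, 5 strata each,
# 1 case + 3 controls per stratum (assumed layout, not the poster's data).
n_id <- 30; n_strata <- 5; n_per <- 4
toy <- expand.grid(row = seq_len(n_per),
                   s = seq_len(n_strata),
                   ID_individual = seq_len(n_id))
toy$ID_strata <- paste(toy$ID_individual, toy$s, sep = "_")
toy$forest    <- rbinom(nrow(toy), 1, 0.5)
toy$log_area  <- rnorm(nrow(toy))
toy$log_time  <- rnorm(nrow(toy))

# Exactly one event per stratum, chosen with probability proportional to exp(lp).
lp <- 0.8 * toy$forest + 0.5 * toy$log_area
toy$event <- ave(lp, toy$ID_strata, FUN = function(z) {
  as.numeric(seq_along(z) == sample(seq_along(z), 1, prob = exp(z)))
})

# One list element per individual: the "single index per group" layout.
ids   <- unique(toy$ID_individual)
by_id <- split(toy, toy$ID_individual)

boot_clogit <- function(id_vec, indices) {
  sampled <- id_vec[indices]
  # Rebuild the long data set from the sampled individuals.
  pieces <- lapply(seq_along(sampled), function(k) {
    d <- by_id[[as.character(sampled[k])]]
    d$boot_id     <- k                                  # distinct label per draw
    d$boot_strata <- paste(k, d$ID_strata, sep = ".")   # keep duplicate draws apart
    d
  })
  new_data <- do.call(rbind, pieces)
  fit <- clogit(event ~ forest + log_area + forest:log_time + strata(boot_strata),
                data = new_data, method = "efron")
  coef(fit)
}

# R is kept small here only so the sketch runs quickly.
boot_out <- boot(data = ids, statistic = boot_clogit, R = 200)

# Percentile intervals, one column per coefficient.
ci <- apply(boot_out$t, 2, quantile, probs = c(0.025, 0.975))
colnames(ci) <- names(boot_out$t0)
ci
```

Whether resampling individuals is the right unit depends on the sampling design, which is still unstated, so treat this as a starting point rather than a recommendation.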

-- 
David.
> 
> Thank you very much for your time.
> Best regards,
> Nell
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

'Any technology distinguishable from magic is insufficiently advanced.'   -Gehm's Corollary to Clarke's Third Law