[R] Running "all possible subsets" of a GLM (binomial) model

hadley wickham h.wickham at gmail.com
Mon Jul 7 19:04:02 CEST 2008


On Mon, Jul 7, 2008 at 10:18 AM, Eric Vander Wal <ejvander at lakeheadu.ca> wrote:
> I have spent a fair amount of time looking for a package that is automated
> to run glm (binomial) regression models with all possible subsets of my
> independent variables.  Something akin to Lumley's "leaps" package, but can
> be applied to glms, not just lms; or something similar to Stata's brute
> force "tryem" function?  If anyone can point me in the right direction I
> would really appreciate it.

Have a look at fitall in the meifly package:

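fitall <- function(y, x, method=lm, ...) {
	# Response and predictors in one data frame; x is expected to be a data frame
	data <- cbind(y=y, x)

	# One row per non-empty subset of predictors (row 1, all FALSE, is dropped);
	# drop=FALSE keeps a data frame even when x has a single column
	combs <- do.call(expand.grid, rep(list(c(FALSE, TRUE)), ncol(x)))[-1, , drop=FALSE]

	# Turn each row of inclusion flags into a formula such as y ~ a + b
	vars <- apply(combs, 1, function(i) names(x)[i])
	form <- paste("y ~ ", lapply(vars, paste, collapse=" + "), sep = "")
	form <- lapply(form, as.formula)

	# Fit one model per formula; ... is passed on to method (e.g. family for glm)
	models <- lapply(form, function(f) eval(substitute(method(f, data=data, ...), list(f=f, data=data, method=method))))
	names(models) <- seq_along(models)
	class(models) <- c("ensemble", class(models))
	models
}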
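
For the binomial case in the question, the listing suggests that a call like fitall(y, x, method = glm, family = binomial) would do the job, since method and ... are handed straight to the fitting call. The same idea can be sketched in base R without the package. This is only an illustration: the simulated data and the names dat, preds, and subsets are made up for the example and are not part of meifly.

```r
# Simulated binary response with three candidate predictors (illustrative only).
set.seed(1)
n <- 200
x <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
y <- rbinom(n, 1, plogis(0.5 + 1.2 * x$a - 0.8 * x$b))
dat <- cbind(y = y, x)

# Every non-empty subset of predictor names: 2^p - 1 of them.
preds <- names(x)
subsets <- unlist(
  lapply(seq_along(preds), function(k) combn(preds, k, simplify = FALSE)),
  recursive = FALSE
)

# One binomial GLM per subset; reformulate() builds y ~ a + b and so on.
models <- lapply(subsets, function(v) {
  glm(reformulate(v, response = "y"), family = binomial, data = dat)
})
names(models) <- vapply(subsets, paste, character(1), collapse = " + ")

length(models)        # number of fitted models
coef(models[["a + b"]])
```

The number of models doubles with every extra predictor, so this brute-force approach is only practical for a modest number of variables.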

That should get you started. The meifly package also contains a few
functions for summarising and visualising these ensembles of models.
See http://had.co.nz/meifly/ for a little more detail, and a paper
using meifly for a simple case study.
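
As a rough idea of what summarising an ensemble can look like, here is a base R sketch that tabulates a few model-level statistics for a list of fitted models. It does not use meifly's own summary functions; the small hand-built list of models and the column names in summ are assumptions made for the example.

```r
# Simulated data; a small hand-built list stands in for an ensemble of models.
set.seed(42)
n <- 150
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.3 + dat$x1))

forms <- list(y ~ x1, y ~ x2, y ~ x1 + x2)
models <- lapply(forms, function(f) glm(f, family = binomial, data = dat))

# One row per model: formula, number of coefficients, and fit criteria.
summ <- data.frame(
  formula = vapply(models, function(m) deparse(formula(m)), character(1)),
  n_coef  = vapply(models, function(m) length(coef(m)), integer(1)),
  logLik  = vapply(models, function(m) as.numeric(logLik(m)), numeric(1)),
  AIC     = vapply(models, AIC, numeric(1)),
  BIC     = vapply(models, BIC, numeric(1))
)

# Smallest AIC first.
summ <- summ[order(summ$AIC), ]
print(summ, row.names = FALSE)
```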

Hadley


-- 
http://had.co.nz/


