[R] Help using mapply to run multiple models

William Dunlap wdunlap at tibco.com
Wed Dec 18 21:07:18 CET 2013


Try something like the following.  Because lm() and glm() evaluate
many of their arguments in nonstandard ways, f() manipulates the call
and then evaluates it in the frame from which f() was called.
It also puts that environment on the formula that it creates so
the formula can refer to variables in that environment.
    f <- function (responseName, predictorNames, data, ..., envir = parent.frame())
    {
        call <- match.call()
        # formula()'s environment argument is 'env'; 'envir=' would be swallowed by '...'
        call$formula <- formula(paste(responseName, sep = " ~ ",
            paste0("`", predictorNames, "`", collapse = " + ")), env = envir)
        call[[1]] <- quote(glm) # 'f' -> 'glm'
        call$responseName <- NULL # omit responseName=
        call$predictorNames <- NULL # omit 'predictorNames='
        call$envir <- NULL # omit 'envir=' if supplied; glm() does not accept it
        eval(call, envir = envir)
    }
as in
    z <- lapply(list(c("hp","drat"), c("cyl"), c("am","gear")),
        FUN = function(preds) f("carb", preds, data=mtcars, family=poisson))
    lapply(z, summary)
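
If the goal is a single table rather than a list of printed summaries, something along these lines may help. This is only a sketch: it refits the same three models with reformulate() instead of f() so that it runs on its own, and the names pred_sets, fits and coef_table are made up for the illustration.

```r
# Predictor sets from the example above
pred_sets <- list(c("hp", "drat"), "cyl", c("am", "gear"))

# Fit one Poisson model per predictor set; reformulate() builds
# formulas such as carb ~ hp + drat from character vectors
fits <- lapply(pred_sets, function(preds)
    glm(reformulate(preds, response = "carb"), data = mtcars, family = poisson))

# coef(summary(fit)) is a matrix with one row per term; stack the
# matrices into one data frame, tagging each row with its model index
coef_table <- do.call(rbind, lapply(seq_along(fits), function(i) {
    cf <- coef(summary(fits[[i]]))
    data.frame(model = i, term = rownames(cf), cf,
               row.names = NULL, check.names = FALSE)
}))
print(coef_table)
```

Each row then holds the estimate, standard error, z value and p-value for one term of one model.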

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Simon Kiss
> Sent: Wednesday, December 18, 2013 9:11 AM
> To: Dennis Murphy
> Cc: r-help at r-project.org
> Subject: Re: [R] Help using mapply to run multiple models
> 
> Dennis, how would your function be modified to make it more flexible in the future?
> I'm thinking of something like:
> > f <- function(x='Dependent variable', y='List of Independent Variables', z='Data Frame')
> > {
> >    form <- as.formula(paste(x, y, sep = " ~ "))
> >    glm(form, data =z)
> > }
> 
> I tried that then using
> modlist <- lapply(xvars, f), but it didn't work.
> 
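
A note on why that call probably failed: lapply(xvars, f) hands each element of xvars to the first argument of f, so the predictor name ends up in the slot meant for the dependent variable and the other two arguments keep their placeholder defaults. A minimal sketch of one way around it follows; the function name fit_one, its argument names and the seed are assumptions made for the illustration, not anything from the thread.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))
xvars <- names(sample.df)[-1]

# Hypothetical three-argument version of Dennis's function
fit_one <- function(response, predictor, data, family = binomial) {
    form <- as.formula(paste(response, predictor, sep = " ~ "))
    glm(form, data = data, family = family)
}

# Extra named arguments to lapply() are passed on to fit_one();
# each element of xvars fills the remaining argument, 'predictor'
modlist <- lapply(xvars, fit_one, response = "var1", data = sample.df)
names(modlist) <- xvars
lapply(modlist, summary)
```

The fixed arguments are supplied by name, so only the argument that varies is left for lapply() to fill in.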
> On 2013-12-18, at 3:29 AM, Dennis Murphy <djmuser at gmail.com> wrote:
> 
> > Hi:
> >
> > Here's a way to generate a list of model objects. Once you have the
> > list, you can write or call functions to extract useful pieces of
> > information from a model object and use lapply() to apply them to
> > each list component in turn.
> >
> > sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5),
> >                      var2=rbinom(50, size=2, prob=0.5),
> >                      var3=rbinom(50, size=3, prob=0.5),
> >                      var4=rbinom(50, size=2, prob=0.5),
> >                      var5=rbinom(50, size=2, prob=0.5))
> >
> > # vector of x-variable names
> > xvars <- names(sample.df)[-1]
> >
> > # function to paste a variable x into a formula object and
> > # then pass it to glm()
> > f <- function(x)
> > {
> >    form <- as.formula(paste("var1", x, sep = " ~ "))
> >    glm(form, data = sample.df)
> > }
> >
> > # Apply the function f to each variable in xvars
> > modlist <- lapply(xvars, f)
> >
> > To give you an idea of some of the things you can do with the list:
> >
> > sapply(modlist, class)        # return class of each component
> > lapply(modlist, summary)   # return the summary of each model
> >
> > # combine the model coefficients into a two-column matrix
> > do.call(rbind, lapply(modlist, coef))
> >
> >
> > You'd probably want to rename the second column since the slopes are
> > associated with different x variables.
> >
> > Dennis
> >
> > On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss <sjkiss at gmail.com> wrote:
> >> I think I'm missing something.  I have a data frame that looks like the one below.
> >> sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5),
> >>                      var2=rbinom(50, size=2, prob=0.5),
> >>                      var3=rbinom(50, size=3, prob=0.5),
> >>                      var4=rbinom(50, size=2, prob=0.5),
> >>                      var5=rbinom(50, size=2, prob=0.5))
> >>
> >> I'd like to run a series of univariate generalized linear models where var1 is
> >> always the dependent variable and each of the other variables is the independent
> >> variable. Then I'd like to summarize each in a table.
> >> I've tried :
> >>
> >> sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
> >> mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')
> >>
> >> And that works pretty well, except, I'm left with a matrix that contains all the
> >> information I need. I can't figure out how to use summary() properly on this
> >> information to usefully report that information.
> >>
> >> Thank you for any suggestions.
> >>
> >> *********************************
> >> Simon J. Kiss, PhD
> >> Assistant Professor, Wilfrid Laurier University
> >> 73 George Street
> >> Brantford, Ontario, Canada
> >> N3T 2C9
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> 


