[R] Help using mapply to run multiple models

Bert Gunter gunter.berton at gene.com
Wed Dec 18 18:35:51 CET 2013


Folks:

1. Haven't closely followed the thread. I'm responding only to Simon's post.

2. ?formula ## Especially note the use of "."

So just make an appropriately constructed data frame for the data
argument of glm:

## example

> df <- data.frame(y=rnorm(9),x1=runif(9), x2=1:9)
> glm(y~.,data=df)
## y does not need to be in the data frame; if it is not, glm() looks
## for it in the environment of the formula.
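
Applied to the OP's data, the "." idea could look roughly like the
sketch below. It is only one way to do it: fit_one() is a made-up
helper name, the seed is arbitrary, and family = binomial is assumed
from the OP's original mapply() call.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))

# names of the candidate predictors (everything except the response)
xvars <- setdiff(names(sample.df), "var1")

# fit_one() is a hypothetical helper: it keeps only the response and one
# predictor, so that "." in the formula expands to just that predictor
fit_one <- function(x, data) {
    glm(var1 ~ ., data = data[, c("var1", x)], family = binomial)
}

# one fitted model per predictor, in a named list
modlist <- lapply(xvars, fit_one, data = sample.df)
names(modlist) <- xvars

# intercept and slope of each model, one row per predictor
coef.mat <- do.call(rbind, lapply(modlist, coef))
colnames(coef.mat) <- c("intercept", "slope")
coef.mat
```

Because the result is an ordinary list, lapply(modlist, summary) gives
the summary of each fit.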

Another way to handle the OP is via substitute() or bquote(). I'll
leave that to others to explain.
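
For anyone who wants a starting point, a minimal bquote() sketch might
look like the following. fit_bq() is a made-up name, the seed is
arbitrary, and family = binomial is assumed from the OP's mapply()
call; treat it as an illustration rather than the recommended way.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable
sample.df <- data.frame(var1 = rbinom(50, size = 1, prob = 0.5),
                        var2 = rbinom(50, size = 2, prob = 0.5),
                        var3 = rbinom(50, size = 3, prob = 0.5),
                        var4 = rbinom(50, size = 2, prob = 0.5),
                        var5 = rbinom(50, size = 2, prob = 0.5))

xvars <- setdiff(names(sample.df), "var1")

# fit_bq() is a hypothetical helper: bquote() splices the predictor's
# name into the glm() call as a symbol, and eval() then runs that call,
# so the stored call shows the actual formula (e.g. var1 ~ var2)
fit_bq <- function(x, data) {
    eval(bquote(glm(var1 ~ .(as.name(x)), data = data, family = binomial)))
}

modlist <- lapply(xvars, fit_bq, data = sample.df)
names(modlist) <- xvars

# each model's call now records its own formula
lapply(modlist, function(m) m$call$formula)
```

The practical difference from pasting strings together is that each
fitted object carries a readable call, which helps when printing or
updating the models later.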

-- Bert




On Wed, Dec 18, 2013 at 9:11 AM, Simon Kiss <sjkiss at gmail.com> wrote:
> Dennis, how would your function be modified to make it more flexible in future?
> I'm thinking like:
>> f <- function(x='Dependent variable', y='List of Independent Variables', z='Data Frame')
>> {
>>    form <- as.formula(paste(x, y, sep = " ~ "))
>>    glm(form, data =z)
>> }
>
> I tried that then using
> modlist <- lapply(xvars, f), but it didn't work.
>
> On 2013-12-18, at 3:29 AM, Dennis Murphy <djmuser at gmail.com> wrote:
>
>> Hi:
>>
>> Here's a way to generate a list of model objects. Once you have the
>> list, you can write or call functions to extract useful pieces of
>> information from a model object and use lapply() to apply them to
>> each component of the list in turn.
>>
>> sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5),
>>                      var2=rbinom(50, size=2, prob=0.5),
>>                      var3=rbinom(50, size=3, prob=0.5),
>>                      var4=rbinom(50, size=2, prob=0.5),
>>                      var5=rbinom(50, size=2, prob=0.5))
>>
>> # vector of x-variable names
>> xvars <- names(sample.df)[-1]
>>
>> # function to paste a variable x into a formula object and
>> # then pass it to glm()
>> f <- function(x)
>> {
>>    form <- as.formula(paste("var1", x, sep = " ~ "))
>>    glm(form, data = sample.df, family = binomial)
>> }
>>
>> # Apply the function f to each variable in xvars
>> modlist <- lapply(xvars, f)
>>
>> To give you an idea of some of the things you can do with the list:
>>
>> sapply(modlist, class)        # return class of each component
>> lapply(modlist, summary)   # return the summary of each model
>>
>> # combine the model coefficients into a two-column matrix
>> do.call(rbind, lapply(modlist, coef))
>>
>>
>> You'd probably want to rename the second column since the slopes are
>> associated with different x variables.
>>
>> Dennis
>>
>> On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss <sjkiss at gmail.com> wrote:
>>> I think I'm missing something.  I have a data frame that looks like the one below.
>>> sample.df<-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))
>>>
>>> I'd like to run a series of univariate generalized linear models where var1 is always the dependent variable and each of the other variables in turn is the independent variable. Then I'd like to summarize each in a table.
>>> I've tried :
>>>
>>> sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
>>> mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')
>>>
>>> And that works pretty well, except that I'm left with a matrix that contains all the information I need. I can't figure out how to use summary() on it to report that information usefully.
>>>
>>> Thank you for any suggestions.
>>>
>>> *********************************
>>> Simon J. Kiss, PhD
>>> Assistant Professor, Wilfrid Laurier University
>>> 73 George Street
>>> Brantford, Ontario, Canada
>>> N3T 2C9
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374


