[R] merge coefficients from a glmlist of models

John Fox jfox at mcmaster.ca
Tue Oct 28 17:13:04 CET 2014


Hi Michael,

How about this?


coef.glmlist <- function(object, result=c("list", "matrix", "data.frame"),
...){
    result <- match.arg(result)
    coefs <- lapply(object, coef)
    if (result == "list") return(coefs)
    coef.names <- unique(unlist(lapply(coefs, names)))
    n.mods <- length(object)
    coef.matrix <- matrix(NA, length(coef.names), n.mods)
    rownames(coef.matrix) <- coef.names
    colnames(coef.matrix) <- names(object)
    for (i in 1:n.mods){
        coef <- coef(object[[i]])
        coef.matrix[names(coef), i] <- coef
    }
    if (result == "matrix") return(coef.matrix)
    as.data.frame(coef.matrix)
}

> coef(mods, result="data.frame")
                      donner.mod1 donner.mod2 donner.mod3 donner.mod4
(Intercept)            1.59915455  1.85514867   0.8792031   0.7621901
age                   -0.03379836 -0.04565225          NA          NA
sexMale               -1.20678665 -1.62177307  -1.3745016  -1.0995718
age:sexMale                    NA  0.01957257          NA          NA
poly(age, 2)1                  NA          NA  -7.9366059 -26.9688970
poly(age, 2)2                  NA          NA  -6.6929413 -30.5626032
poly(age, 2)1:sexMale          NA          NA          NA  22.7210591
poly(age, 2)2:sexMale          NA          NA          NA  28.8975876

Best,
 John

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Michael Friendly
> Sent: Tuesday, October 28, 2014 11:47 AM
> To: R-help
> Subject: [R] merge coefficients from a glmlist of models
> 
> In the vcdExtra package, I have a function glmlist to collect a set of
> glm() models as a "glmlist" object,
> and other functions that generate & fit such a collection of models.
> 
> This is my working example, fitting a set of models to the Donner data
> 
> # install.packages("vcdExtra", repos="http://R-Forge.R-project.org")#
> needs devel version
> data("Donner", package="vcdExtra")
> # make survived a factor
> Donner$survived <- factor(Donner$survived, labels=c("no", "yes"))
> Donner$family.size <- ave(as.numeric(Donner$family), Donner$family,
> FUN=length)
> # collapse small families into "Other"
> fam <- Donner$family
> levels(fam)[c(3,4,6,7,9)] <- "Other"
> # reorder, putting Other last
> fam = factor(fam,levels(fam)[c(1, 2, 4:6, 3)])
> Donner$family <- fam
> 
> # fit models
> donner.mod0 <- glm(survived ~ age, data=Donner, family=binomial)
> donner.mod1 <- glm(survived ~ age + sex, data=Donner, family=binomial)
> donner.mod2 <- glm(survived ~ age * sex , data=Donner, family=binomial)
> donner.mod3 <- glm(survived ~ poly(age,2) + sex, data=Donner,
> family=binomial)
> donner.mod4 <- glm(survived ~ poly(age,2) * sex, data=Donner,
> family=binomial)
> mods <- glmlist(donner.mod1, donner.mod2, donner.mod3, donner.mod4)
> 
> I'd like to write other methods for handling a glmlist, similar to the
> way stats::anova.glmlist works, e.g.,
> 
>  > library(vcdExtra)
>  > mods <- glmlist(donner.mod1, donner.mod2, donner.mod3, donner.mod4)
>  >
>  > anova(mods, test="Chisq")
> Analysis of Deviance Table
> 
> Model 1: survived ~ age + sex
> Model 2: survived ~ age * sex
> Model 3: survived ~ poly(age, 2) + sex
> Model 4: survived ~ poly(age, 2) * sex
> Resid. Df Resid. Dev Df Deviance Pr(>Chi)
> 1 87 111.128
> 2 86 110.727 1 0.4003 0.52692
> 3 86 106.731 0 3.9958
> 4 84 97.799 2 8.9321 0.01149 *
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> # gives same result as anova() with an explicit list of models:
> 
>  > anova(donner.mod1, donner.mod2, donner.mod3, donner.mod4,
> test="Chisq")
> Analysis of Deviance Table
> 
> Model 1: survived ~ age + sex
> Model 2: survived ~ age * sex
> Model 3: survived ~ poly(age, 2) + sex
> Model 4: survived ~ poly(age, 2) * sex
> Resid. Df Resid. Dev Df Deviance Pr(>Chi)
> 1 87 111.128
> 2 86 110.727 1 0.4003 0.52692
> 3 86 106.731 0 3.9958
> 4 84 97.799 2 8.9321 0.01149 *
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Then, using the function vcdExtra::Summarise, I can define a
> Summarise.glmlist method that is essentially
> 
> sumry <- lapply(mods, Summarise)
> do.call(rbind, sumry)
> 
>  > Summarise(mods)# not yet added to the package
> Likelihood summary table:
> AIC BIC LR Chisq Df Pr(>Chisq)
> donner.mod1 117.13 124.63 111.128 87 0.04159 *
> donner.mod2 118.73 128.73 110.727 86 0.03755 *
> donner.mod3 114.73 124.73 106.731 86 0.06439 .
> donner.mod4 109.80 124.80 97.799 84 0.14408
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Similarly, I can define a residuals.glmlist method, using using cbind()
> to collect the residuals from all
> models.
> But I'm stumped on a coef() method, because the coefficients fitted in
> various models
> differ.
> 
>  > coefs <- lapply(mods, "coef")
>  > coefs
> $donner.mod1
> (Intercept) age sexMale
> 1.59915455 -0.03379836 -1.20678665
> 
> $donner.mod2
> (Intercept) age sexMale age:sexMale
> 1.85514867 -0.04565225 -1.62177307 0.01957257
> 
> $donner.mod3
> (Intercept) poly(age, 2)1 poly(age, 2)2 sexMale
> 0.8792031 -7.9366059 -6.6929413 -1.3745016
> 
> $donner.mod4
> (Intercept) poly(age, 2)1 poly(age, 2)2 sexMale
> 0.7621901 -26.9688970 -30.5626032 -1.0995718
> poly(age, 2)1:sexMale poly(age, 2)2:sexMale
> 22.7210591 28.8975876
> 
> The result I want is a data.frame with columns corresponding to the
> models, and rows corresponding
> to the unique coefficient names, with NA filled in where a term is
> missing.
> 
> 
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list