[R] converting a list of loglin terms to a model formula

William Dunlap wdunlap at tibco.com
Fri Jul 5 22:41:26 CEST 2013


> Call:
> loglm(formula = form, data = x)       ### I want formula = ~Hair:Eye + Sex here

Since your function made the call
   loglm(form, data=x)
the 'call' component of output is going to show 'form', not '~ Hair:Eye + Sex'.
You can use bquote to pre-evaluate the formula=form argument to get the call
to look nicer, as in:
  form <- mpg ~ wt + hp
  eval(bquote(lm(.(form), data=mtcars)))
instead of
   lm(form, data=mtcars)

In your loglin2formula, I would make by environment of the generated formula
the environment of the caller of loglin2formula (or somewhere else if the user wishes)
by adding the argument
   env = parent.frame()
and replacing
   as.formula( sprintf(...) )
with
   formula( sprintf(...), env=env)
(I don't think you've run into any problems related to having an irrelevant
environment attached to the formula, but they will happen if the formula
involves any variable names that happen to be in loglin2formula.)

I would also change the loglin2formula so it worked with non-syntactic names.
Wrapping them with backquotes would probably do it, but I may have missed
something in the back and forth between character strings and formula.

As for loglin2string, you complain that it works when given a list of character
vectors but not when is given a formula.  That is not surprising.  Did you mean
for test_loglm to pass it 'margins' instead of 'form'?

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of Michael Friendly
> Sent: Friday, July 05, 2013 12:21 PM
> To: Henrique Dallazuanna
> Cc: R-help
> Subject: Re: [R] converting a list of loglin terms to a model formula
> 
> 
> On 7/3/2013 6:38 PM, Henrique Dallazuanna wrote:
> > Try this:
> >
> >  as.formula(sprintf(" ~ %s", do.call(paste, c(lapply(mutual(3), paste,
> > collapse = ":"), sep = "+"))))
> >
> Thanks for this.  I encapsulated this as a function, loglin2formula()
> and a related function,
> loglin2string() to give a character string representation.
> They seem to work when used from the command line, but don't give
> what I need when I use it in another function.  I think it has something
> to do with the environment
> for the formula or how I pass it to MASS::loglm in my function
> test_loglm (below).
> 
> I'll demonstrate the problem first, then give the functions & test cases
> I'm using as
> runnable code.
> 
>  > joint(3, table=HairEyeColor)
> $term1
> [1] "Hair" "Eye"
> 
> $term2
> [1] "Sex"
> 
>  > loglin2formula(joint(3, table=HairEyeColor))
> ~Hair:Eye + Sex
> <environment: 0x0884a640>
>  > loglin2string(joint(3, table=HairEyeColor))
> [1] "[Hair,Eye] [Sex]"
> 
> These look like what I want, more or less; but, when I use these in the
> function test_loglm
> (also below) , the formula doesn't get resolved when I call loglm (it
> appears as
> formula = form), and the result of loglin2string gets garbled. The
> numerical results are
> correct; I'm concerned about the labeling in the computed loglm object.
> 
>  > test_loglm(HairEyeColor, type='joint')
> formula:
> ~Hair:Eye + Sex
> <environment: 0x08842de0>
> model.string:
> [1] "[~] [+,Hair:Eye,Sex]"                   ### I want "[Hair,Eye]
> [Sex]" here
> model:
> Call:
> loglm(formula = form, data = x)       ### I want formula = ~Hair:Eye +
> Sex here
> 
> Statistics:
>                        X^2 df  P(> X^2)
> Likelihood Ratio 19.85656 15 0.1775045
> Pearson          19.56712 15 0.1891745
>  >
> 
> ## --- functions and test cases, should be runnable (mod line folding)
> --- ###
> 
> #' convert a loglin model to a model formula for loglm
> 
> #' @param  x a list of terms in a loglinear model, such as returned by
> \code{joint}, \code{conditional}, \dots
> #' @source Code from Henrique Dallazuanna, <wwwhsd at gmail.com>, R-help
> 7-4-2013
> 
> loglin2formula <- function(x) {
>      terms <- lapply(x, paste, collapse = ":")
>      as.formula(sprintf(" ~ %s", do.call(paste, c(terms, sep = "+"))))
> }
> 
> #' convert a loglin model to a string, using bracket notation for the
> high-order terms
> 
> #' @param x a list of terms in a loglinear model, such as returned by
> \code{joint}, \code{conditional}, \dots
> #' @param brackets characters to use to surround model terms.
> #' @param sep characters used to separate factor names within a term
> #' @param collapse  characters used to separate terms
> #' @param abbrev
> 
> loglin2string <- function(x, brackets = c('[', ']'), sep=',', collapse='
> ', abbrev) {
>      if (length(brackets)==1 && (nchar(brackets)>1)) brackets <-
> unlist(strsplit(brackets, ""))
>      terms <- lapply(x, paste, collapse=sep)
>      terms <- paste(brackets[1], terms, brackets[2], sep='')
>      paste(terms, collapse= ' ')
> }
> 
> #' models of joint independence, of some factors wrt one or more other
> factors
> 
> #' @param nf number of factors for which to generate model
> #' @param table a contingency table used for factor names, typically the
> output from \code{\link[base]{table}}
> #' @param factors names of factors used in the model when \code{table}
> is not specified
> #' @param with    indices of the factors against which others are
> considered jointly independent
> #' @export
> 
> joint <- function(nf, table=NULL, factors=1:nf, with=nf) {
>      if (!is.null(table)) factors <- names(dimnames(table))
>      if (nf == 1) return (list(term1=factors[1]))
>      if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
>      others <- setdiff(1:nf, with)
>      result <- list(term1=factors[others], term2=factors[with])
>      result
>    }
> 
> ### Test using these in another function
> 
> test_loglm <- function(
>      x, type = c("joint", "conditional"),
>      ...)
> {
>      nf <- length(dim(x))
>      factors <- names(dimnames(x))
>      type <- match.arg(type)
>      margins <- switch(type,
>        "joint" = joint(nf, factors=factors),
>        "conditional" = conditional(nf, factors=factors)
>        )
> 
>    form <- loglin2formula(margins);
>    cat("formula:\n"); print(form)
>    model.string <- loglin2string(form)
>    cat("model.string:\n"); print(model.string)
>    mod <- loglm(formula=form, data=x)
>    cat("model:\n"); print(mod)
>    invisible(list(form=form, model.string=model.string, mod=mod))
> }
> 
> ## tests
> 
> loglin2formula(joint(3, table=HairEyeColor))
> loglin2string(joint(3, table=HairEyeColor))
> test_loglm(HairEyeColor, type='joint')
> 
> 
> >
> >
> > On Wed, Jul 3, 2013 at 6:55 PM, Michael Friendly <friendly at yorku.ca
> > <mailto:friendly at yorku.ca>> wrote:
> >
> >     I'm developing some functions to create symbolic specifications
> >     for loglinear models of different types.
> >     I don't really know how to 'compute' with model formulas, so I've
> >     done this in the notation
> >     for stats::loglin(), which is a list of high-order terms in the model.
> >
> >     What I'd like is a function to turn the results of these into a
> >     model formula, suitable for
> >     MASS::loglm.  That's the reverse of what loglm does.
> >
> >     For example, the simplest versions of models for 3-way tables for
> >     joint,
> >      conditional, and marginal independence can be computed as
> >     follows. After each, I indicated
> >     the WANTED model formula I'd like from the result
> >
> >     > joint(3)
> >     $term1
> >     [1] 1 2
> >
> >     $term2
> >     [1] 3
> >
> >     WANTED:  ~ 1:2 + 3
> >
> >     > condit(3)
> >     $term1
> >     [1] 1 3
> >
> >     $term2
> >     [1] 2 3
> >
> >     WANTED: ~ 1:2 + 2:3
> >
> >     > mutual(3)
> >     $term1
> >     [1] 1
> >
> >     $term2
> >     [1] 2
> >
> >     $term3
> >     [1] 3
> >
> >     WANTED: ~ 1 + 2 + 3
> >
> >     In case anyone want to play with the code, here are the current,
> >     not too elegant definitions
> >     of the functions, and some further test cases,
> >
> >     # models of joint independence
> >       joint <- function(nf, factors=1:nf, with=nf) {
> >         if (nf == 1) return (list(term1=factors[1]))
> >         if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
> >         others <- setdiff(1:nf, with)
> >         result <- list(term1=factors[others], term2=factors[with])
> >         result
> >       }
> >     # conditional independence
> >       condit <- function(nf, factors=1:nf, with=nf) {
> >         if (nf == 1) return (list(term1=factors[1]))
> >         if (nf == 2) return (list(term1=factors[1], term2=factors[2]))
> >         main <- setdiff(1:nf, with)
> >         others <- matrix(factors[with], length(with), length(main))
> >         result <- rbind(factors[main], others)
> >         result <- as.list(as.data.frame(result, stringsAsFactors=FALSE))
> >         names(result) <- paste('term', 1:length(result), sep='')
> >         result
> >       }
> >     # mutual independence
> >       mutual <- function(nf, factors=1:nf) {
> >         result <- sapply(factors[1:nf], list)
> >         names(result) <- paste('term', 1:length(result), sep='')
> >         result
> >       }
> >
> >     ### some comparisons
> >
> >     loglin(HairEyeColor, list(c(1, 2), c(1, 3), c(2, 3)))$lrt
> >     loglm(~1:2 + 1:3 +2:3, HairEyeColor)
> >
> >     # use factor names
> >     joint(3, factors=names(dimnames(HairEyeColor)))
> >     condit(3, factors=names(dimnames(HairEyeColor)))
> >
> >     loglin(HairEyeColor, joint(3))$lrt
> >     loglm(~1:2 + 3, HairEyeColor)
> >
> >     loglin(HairEyeColor, condit(3))$lrt
> >     loglm(~1:3 + 2:3, HairEyeColor)
> >
> >
> >
> >     --
> >     Michael Friendly     Email: friendly AT yorku DOT ca
> >     Professor, Psychology Dept. & Chair, Quantitative Methods
> >     York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> >     4700 Keele Street    Web: http://www.datavis.ca
> >     Toronto, ONT  M3J 1P3 CANADA
> >
> >     ______________________________________________
> >     R-help at r-project.org <mailto:R-help at r-project.org> mailing list
> >     https://stat.ethz.ch/mailman/listinfo/r-help
> >     PLEASE do read the posting guide
> >     http://www.R-project.org/posting-guide.html
> >     and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> >
> > --
> > Henrique Dallazuanna
> > Curitiba-Paraná-Brasil
> > 25° 25' 40" S 49° 16' 22" O
> 
> 
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, Quantitative Methods
> York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> 4700 Keele Street    Web:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
> 
> 
> 	[[alternative HTML version deleted]]



More information about the R-help mailing list