[R] abbreviating words in a model formula

William Dunlap wdunlap at tibco.com
Mon Jul 8 20:54:54 CEST 2013


The call to all.names() below probably should have the unique=TRUE argument.
It doesn't make any difference in this particular code, but having repeated names
could cause problems in related code.
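
A minimal sketch of that change, assuming nothing else in f() needs to
differ.  The formula is the detg.m0 one from the original post, where M.user
and Temperature each appear twice; the names fm, withDups, noDups and short
are only for illustration.

```r
f <- function(formula, minNameLength = 2, abbreviateFunctionNames = FALSE) {
    # unique = TRUE drops repeated names before they reach abbreviate()
    names <- all.names(formula, functions = abbreviateFunctionNames,
                       unique = TRUE)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength), as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

fm <- Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature

withDups <- all.names(fm, functions = FALSE)                 # repeats kept
noDups   <- all.names(fm, functions = FALSE, unique = TRUE)  # each name once

short <- f(fm, minNameLength = 4)
```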

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> Of William Dunlap
> Sent: Monday, July 08, 2013 11:02 AM
> To: Michael Friendly; R-help
> Subject: Re: [R] abbreviating words in a model formula
> 
> Try using all.names() to get all the names in the formula.  E.g.,
> 
> f <- function (formula, minNameLength = 2, abbreviateFunctionNames = FALSE)
> {
>     names <- all.names(formula, functions = abbreviateFunctionNames)
>     abbrNames <- lapply(abbreviate(names, minlength = minNameLength),
>         as.name)
>     deparse(do.call("substitute", list(formula, abbrNames)))
> }
> 
> used as
>   > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor))
>   [1] "MR ~ log(FP) + sqrt(SP)"
>   > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor), min=4)
>   [1] "MyRs ~ log(FrsP) + sqrt(ScnP)"
>   > f(MyResponse ~ log(FirstPredictor) + sqrt(SecondPredictor),
>       abbreviateFunctionNames=TRUE)
>   [1] "MR ~ lg(FP) + sq(SP)"
> 
> You could put that in a loop that stopped when nchar(f(...)) got small enough.
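
One way such a loop might look, as a sketch only.  shortenFormula, maxTotal
and startLength are hypothetical names, not part of the code above; the
defaults of 40 and 6 are taken from the targets in the original question.
f() is repeated, with unique = TRUE, so the block runs on its own.

```r
f <- function(formula, minNameLength = 2, abbreviateFunctionNames = FALSE) {
    names <- all.names(formula, functions = abbreviateFunctionNames,
                       unique = TRUE)
    abbrNames <- lapply(abbreviate(names, minlength = minNameLength), as.name)
    deparse(do.call("substitute", list(formula, abbrNames)))
}

# Hypothetical helper: try successively smaller minNameLength values
# until the deparsed formula fits in maxTotal characters.
shortenFormula <- function(formula, maxTotal = 40, startLength = 6) {
    for (len in seq(startLength, 1)) {
        # deparse() may return several strings for a long formula; join them
        s <- paste(trimws(f(formula, minNameLength = len)), collapse = " ")
        if (nchar(s) <= maxTotal) break
    }
    s  # if nothing fit, this is the minNameLength = 1 attempt
}

fm <- Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature
res <- shortenFormula(fm, maxTotal = 40)
```

Note that minlength in abbreviate() is a lower bound, not a cap: a name can
stay longer than minNameLength when that is needed to keep the abbreviations
distinct.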
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
> > Of Michael Friendly
> > Sent: Monday, July 08, 2013 10:36 AM
> > To: R-help
> > Subject: [R] abbreviating words in a model formula
> >
> > For an application, I need to get a character string representation of
> > the formula or model call for glm objects, but also, for labeling output
> > and plots, I want to be able to abbreviate the words (variables) in model
> > terms.  This requires some formula magic that I can't quite get, in
> > particular extracting the terms from a formula and then the words in
> > each term.
> >
> > Perhaps there is some code for something similar
> > I haven't found yet, or someone can suggest how to do this.
> >
> > A runnable example to show what I mean:
> >
> > Freq <- c(68,42,42,30, 37,52,24,43,
> >      66,50,33,23, 47,55,23,47,
> >      63,53,29,27, 57,49,19,29)
> >
> > Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
> > Softness <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
> > M.user <- gl(2, 4, 24, labels = c("N", "Y"))
> > Brand <- gl(2, 1, 24, labels = c("X", "M"))
> >
> > detg <- data.frame(Freq,Temperature, Softness, M.user, Brand)
> > detg.m0 <- glm(Freq ~ M.user*Temperature*Softness +
> >                    Brand*M.user*Temperature,
> >         family = poisson, data = detg)
> >
> > detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
> >         family = poisson, data=detg)
> >
> > detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
> >         family = poisson, data=detg)
> >
> > detg.m2a <- update(detg.m1, . ~ .^2)
> >
> > In plot.lm, I found the following code to extract the model call from a
> > glm object as a string and abbreviate it to a total length <=75.  I need
> > a shorter total length, obtained by abbreviating individual words in the
> > model call, so the approach has to at least extract the terms in the
> > model and then abbreviate the words in each term.
> >
> > # from plot.lm: get model call as a string
> > # TODO: how to use abbreviate to abbreviate the words in the model terms???
> > mod.call <- function(x, max.len=75) {
> >          cal <- x$call
> >          if (!is.na(m.f <- match("formula", names(cal)))) {
> >              cal <- cal[c(1, m.f)]
> >              names(cal)[2L] <- ""
> >          }
> >          cc <- deparse(cal, max.len+5)
> >          nc <- nchar(cc[1L], "c")
> >          abbr <- length(cc) > 1 || nc > max.len
> >          cap <- if (abbr)
> >              paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
> >          else cc[1L]
> >          cap
> > }
> >
> > Tests, & WANTED, say with max length of each word in the string <= 6 &
> > maximum total length <= 40
> >
> >  > mod.call(detg.m0)
> > [1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user *
> > Temperature)"
> >
> > WANTED, something like:
> > "glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"
> >
> >  > mod.call(detg.m2a)
> > [1] "glm(Freq ~ M.user + Temperature + Softness + Brand +
> > M.user:Temperature + M ..."
> >  >
> >  > mod.call(detg.m2a, max.len=200)
> > [1] "glm(Freq ~ M.user + Temperature + Softness + Brand +
> > M.user:Temperature + M.user:Softness + M.user:Brand +
> > Temperature:Softness + Temperature:Brand + Softness:Brand)"
> >  >
> >
> > WANTED, something closer to
> > "glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft
> > + Tmp:Brnd + Sft:Brnd)"
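
A possible way to get a string of that general shape from a fitted model,
offered as a sketch rather than a tested solution.  abbrModCall is a
hypothetical helper, not something from plot.lm or from this thread; it
applies the all.names()/abbreviate()/substitute() idea to formula(x) and
pastes the fitting function's name around the result.  The exact
abbreviations are whatever abbreviate() chooses, so they need not match the
WANTED string above.

```r
# Data and models from the original post
Freq <- c(68,42,42,30, 37,52,24,43,
          66,50,33,23, 47,55,23,47,
          63,53,29,27, 57,49,19,29)
Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness <- gl(3, 8, 24, labels = c("Hard", "Medium", "Soft"))
M.user <- gl(2, 4, 24, labels = c("N", "Y"))
Brand <- gl(2, 1, 24, labels = c("X", "M"))
detg <- data.frame(Freq, Temperature, Softness, M.user, Brand)

detg.m1 <- glm(Freq ~ M.user + Temperature + Softness + Brand,
               family = poisson, data = detg)
detg.m2a <- update(detg.m1, . ~ .^2)

# Hypothetical helper: abbreviate the variable names in a model's formula
# and wrap the result in the name of the fitting function.
abbrModCall <- function(x, minNameLength = 3) {
    fm <- formula(x)
    vars <- all.names(fm, functions = FALSE, unique = TRUE)
    abbr <- lapply(abbreviate(vars, minlength = minNameLength), as.name)
    # a large width.cutoff keeps deparse() from splitting the string
    short <- deparse(do.call("substitute", list(fm, abbr)),
                     width.cutoff = 500L)
    paste0(as.character(x$call[[1L]]), "(", short, ")")
}

res <- abbrModCall(detg.m2a)
```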
> >
> > TIA
> > -Michael
> >
> >
> >
> > --
> > Michael Friendly     Email: friendly AT yorku DOT ca
> > Professor, Psychology Dept. & Chair, Quantitative Methods
> > York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
> > 4700 Keele Street    Web:   http://www.datavis.ca
> > Toronto, ONT  M3J 1P3 CANADA
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 


