[R] abbreviating words in a model formula

Michael Friendly friendly at yorku.ca
Mon Jul 8 19:36:28 CEST 2013


For an application, I need to get a character string representation of
the formula or model call for glm objects, but also, for labeling output
and plots, I want to be able to abbreviate the words (variables) in model
terms.  This requires some formula magic that I can't quite get, in
particular extracting the terms from a formula and then the words in
each term.

Perhaps there is some code for something similar
I haven't found yet, or someone can suggest how to do this.
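
A minimal sketch of the extraction step, offered as one possible starting
point rather than a tested solution. It assumes the formula contains only
plain variable names joined by the usual operators (no function calls such
as log(x)); the object names term_labels, term_words and vars are
illustrative only. It uses terms() and all.vars() from base R.

```r
# Example formula with the same structure as the detg.m0 model
f <- Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature

# terms() expands the formula; "term.labels" holds one string per term,
# with the variables of an interaction joined by ":"
term_labels <- attr(terms(f), "term.labels")

# Split each term label on ":" to get the words (variables) in that term
term_words <- strsplit(term_labels, ":", fixed = TRUE)

# all.vars() returns every variable name in the formula, response included
vars <- all.vars(f)

print(term_labels)
print(vars)
```

Here term_words is a list with one character vector per term, which is
the structure needed to abbreviate word by word.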

A runnable example to show what I mean:

Freq <- c(68,42,42,30, 37,52,24,43,
     66,50,33,23, 47,55,23,47,
     63,53,29,27, 57,49,19,29)

Temperature <- gl(2, 2, 24, labels = c("Low", "High"))
Softness <- gl(3, 8, 24, labels = c("Hard","Medium","Soft"))
M.user <- gl(2, 4, 24, labels = c("N", "Y"))
Brand <- gl(2, 1, 24, labels = c("X", "M"))

detg <- data.frame(Freq,Temperature, Softness, M.user, Brand)
detg.m0 <- glm(Freq ~ M.user*Temperature*Softness + Brand*M.user*Temperature,
        family = poisson, data = detg)

detg.m1 <- glm(Freq ~ (M.user + Temperature + Softness + Brand),
        family = poisson, data=detg)

detg.m2 <- glm(Freq ~ (M.user + Temperature + Softness + Brand)^2,
        family = poisson, data=detg)

detg.m2a <- update(detg.m1, . ~ .^2)

In plot.lm, I found the following code to extract the model call from a
glm object as a string and truncate it to a total length <= 75.  I need
a shorter total length, obtained by abbreviating individual words in the
model call, so the approach has to at least extract the terms in the
model and then abbreviate the words in each term.

# from plot.lm: get model call as a string
# TODO: how to use abbreviate to abbreviate the words in the model terms???
mod.call <- function(x, max.len=75) {
         cal <- x$call
         if (!is.na(m.f <- match("formula", names(cal)))) {
             cal <- cal[c(1, m.f)]
             names(cal)[2L] <- ""
         }
         cc <- deparse(cal, max.len+5)
         nc <- nchar(cc[1L], "c")
         abbr <- length(cc) > 1 || nc > max.len
         cap <- if (abbr)
             paste(substr(cc[1L], 1L, min(max.len, nc)), "...")
         else cc[1L]
         cap
}
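
One way the word-level abbreviation might be done, sketched under stated
assumptions: abbrev_formula is a hypothetical helper (it is not part of
plot.lm), it treats every identifier-like token that matches a name from
all.vars() as a variable, and it offers either abbreviate() or plain
truncation with substr(). Truncation can produce duplicate names, so the
result is meant for labels only.

```r
# Hypothetical helper: abbreviate the variable names inside a formula
# and return the result as a single character string.
abbrev_formula <- function(f, minlength = 6, truncate = FALSE) {
  # Deparse to one string; collapse in case deparse() returns several pieces
  s <- paste(deparse(f, width.cutoff = 500L), collapse = " ")
  s <- gsub("[[:space:]]+", " ", s)

  vars <- all.vars(f)
  short <- if (truncate) {
    setNames(substr(vars, 1L, minlength), vars)   # plain truncation
  } else {
    abbreviate(vars, minlength = minlength)       # named by original name
  }

  # Locate identifier-like tokens and replace those that are variable names
  m <- gregexpr("[A-Za-z.][A-Za-z0-9._]*", s)
  tokens <- regmatches(s, m)[[1L]]
  hit <- tokens %in% vars
  tokens[hit] <- short[tokens[hit]]
  regmatches(s, m) <- list(tokens)
  s
}

f <- Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature
abbrev_formula(f, minlength = 6, truncate = TRUE)
# [1] "Freq ~ M.user * Temper * Softne + Brand * M.user * Temper"
abbrev_formula(f, minlength = 4)
```

For a fitted model x, formula(x) could be passed in place of f and the
function name pasted back on, e.g. paste0("glm(", abbrev_formula(formula(x)), ")").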

Tests, and what is WANTED, say with the length of each word in the
string <= 6 and a maximum total length <= 40:

 > mod.call(detg.m0)
[1] "glm(Freq ~ M.user * Temperature * Softness + Brand * M.user * Temperature)"

WANTED, something like:
"glm(Freq ~ M.user * Temp * Softne + Brand * M.user * Temp)"

 > mod.call(detg.m2a)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M ..."
 >
 > mod.call(detg.m2a, max.len=200)
[1] "glm(Freq ~ M.user + Temperature + Softness + Brand + M.user:Temperature + M.user:Softness + M.user:Brand + Temperature:Softness + Temperature:Brand + Softness:Brand)"
 >

WANTED, something closer to
"glm(Freq ~ M + Tmp + Sft + Brnd + M:Tmp + M.:Sft + M.us:Brnd + Tmp:Sft + Tmp:Brnd + Sft:Brnd)"
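
For the expanded form, a term-by-term variant might look like the sketch
below. The helpers abbrev_terms and fit_terms are hypothetical names; the
sketch assumes a two-sided formula whose terms are plain variables or
":" interactions, and it leaves the response unabbreviated. The exact
abbreviations depend on abbreviate(), so no particular output is claimed.

```r
# Hypothetical helper: abbreviate each word of each expanded model term
abbrev_terms <- function(f, minlength = 3) {
  labels <- attr(terms(f), "term.labels")         # e.g. "M.user:Brand"
  words  <- strsplit(labels, ":", fixed = TRUE)   # words within each term
  short  <- abbreviate(unique(unlist(words)), minlength = minlength)
  rhs <- vapply(words, function(w) paste(short[w], collapse = ":"),
                character(1))
  paste(deparse(f[[2L]]), "~", paste(rhs, collapse = " + "))
}

# Try successively shorter abbreviations until the string fits max.total;
# if nothing fits, the shortest attempt is returned.
fit_terms <- function(f, max.total = 40, start = 6) {
  for (n in seq(start, 1)) {
    s <- abbrev_terms(f, minlength = n)
    if (nchar(s) <= max.total) break
  }
  s
}

f <- Freq ~ (M.user + Temperature + Softness + Brand)^2
abbrev_terms(f, minlength = 3)
fit_terms(f, max.total = 60)
```

With ten terms, a total length of 40 is likely out of reach even with
one-letter names, so fit_terms returns its shortest attempt in that case
instead of failing.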

TIA
-Michael



-- 
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, Quantitative Methods
York University      Voice: 416 736-2100 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA


