[Rd] named arguments in formula and terms

Achim Zeileis Achim.Zeileis at R-project.org
Fri Mar 10 15:02:38 CET 2017


Hi, we came across the following unexpected (for us) behavior in 
terms.formula: When determining whether a term is duplicated, only the 
order of the arguments in function calls seems to be checked but not their 
names. Thus the terms f(x, a = z) and f(x, b = z) are deemed to be 
duplicated and the second one is dropped.

R> attr(terms(y ~ f(x, a = z) + f(x, b = z)), "term.labels")
[1] "f(x, a = z)"

However, changing an argument value or the order of the arguments keeps 
both terms:

R> attr(terms(y ~ f(x, a = z) + f(x, b = zz)), "term.labels")
[1] "f(x, a = z)"  "f(x, b = zz)"
R> attr(terms(y ~ f(x, a = z) + f(b = z, x)), "term.labels")
[1] "f(x, a = z)" "f(b = z, x)"

Is this intended behavior or needed for certain terms?
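
For comparison, here is a minimal sketch of how the same two calls compare 
outside of terms(). The names t1, t2, same_call and same_without_names are 
only illustrative. identical() takes the argument names into account, and 
the calls only coincide once the names are stripped, which is what the 
duplicate check in terms.formula appears to do:

```r
# The two terms from the example above, as unevaluated calls
t1 <- quote(f(x, a = z))
t2 <- quote(f(x, b = z))

# Comparing the calls directly takes the argument names into account
same_call <- identical(t1, t2)
print(same_call)            # FALSE

# Dropping the names leaves the same function and arguments in the same order
same_without_names <- identical(unname(as.list(t1)), unname(as.list(t2)))
print(same_without_names)   # TRUE
```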

We came across this problem when setting up certain smooth regressors with 
different kinds of patterns. As a trivial simplified example we can 
generate the same kind of problem with rep(). Consider the two dummy 
variables rep(x = 0:1, each = 4) and rep(x = 0:1, times = 4). With the 
response y = 1:8 I get:

R> lm((1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))

Call:
lm(formula = (1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))

Coefficients:
            (Intercept)  rep(x = 0:1, each = 4)
                    2.5                     4.0

So while the model is identified because the two regressors are not the 
same, terms.formula does not recognize this and drops the second regressor. 
What I would have wanted can be obtained by switching the arguments:

R> lm((1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))

Call:
lm(formula = (1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))

Coefficients:
             (Intercept)   rep(each = 4, x = 0:1)  rep(x = 0:1, times = 4)
                       2                        4                        1

Of course, here I could avoid the problem by setting up proper factors 
etc. But to me this looks like a potential bug in terms.formula...
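
One way to avoid the problem, sketched here under the assumption that the 
regressors can be computed before the formula is built: store them as 
columns in a data frame so that each term is a plain variable name. The 
names d, x_each, x_times and fit are made up for this illustration.

```r
# Precompute the two dummy regressors so the formula contains only symbols
d <- data.frame(
  y       = 1:8,
  x_each  = rep(x = 0:1, each = 4),   # 0 0 0 0 1 1 1 1
  x_times = rep(x = 0:1, times = 4)   # 0 1 0 1 0 1 0 1
)

# Both terms are distinct variable names, so neither is treated as a duplicate
fit <- lm(y ~ x_each + x_times, data = d)
print(attr(terms(fit), "term.labels"))
print(coef(fit))
```

This should give the same three coefficients as the call with the switched 
arguments above.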

Thanks in advance for any insights,
Z


