[R] passing contrasts=FALSE to contrast functions -- why does this exist?
njs at pobox.com
Fri Jun 11 19:29:02 CEST 2010
I've noticed that all contrast functions, like contr.treatment,
contr.poly, etc., take a logical argument called 'contrasts'. The
default is TRUE, in which case they do their normal thing of returning
an n x (n-1) matrix whose columns are linearly independent of the
intercept.
If contrasts=FALSE, they instead return an n x n matrix with full rank
(usually the identity matrix, corresponding to "dummy" coding, but
contr.poly returns orthogonal polynomials that include the zero-th
order constant term, instead of starting with the linear term as it
normally does).
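To make the two modes concrete, here is a minimal sketch for a 3-level factor. It uses only contr.treatment and contr.poly from the stats package; the variable names are just for illustration.

```r
# Default mode: n x (n-1) contrast matrices.
ct_true <- contr.treatment(3)
cp_true <- contr.poly(3)

# contrasts = FALSE: n x n full-rank matrices.
ct_false <- contr.treatment(3, contrasts = FALSE)  # identity ("dummy" coding)
cp_false <- contr.poly(3, contrasts = FALSE)       # first column is the constant term

dim(ct_true)   # 3 2
dim(ct_false)  # 3 3
ct_false
cp_false
```

Printing ct_false and cp_false side by side shows the difference: the former is the identity, the latter is a polynomial basis with a constant first column.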
Why does this argument exist?
My initial theory was that this was added to support the smart
handling of redundancy in model matrix construction -- depending on
what other terms exist in a formula, sometimes R will choose to
contrast code a factor in n-1 columns, and sometimes it will choose to
dummy code it in n columns. So it would make sense to call the
contrast function with contrasts=TRUE in the former case and
contrasts=FALSE in the latter case, and that way if the contrast
function for some reason wanted a full-rank coding *besides* dummy
coding then it could do that (like contr.poly).
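The redundancy handling I mean can be seen in a small sketch like the following, assuming the default options("contrasts") setting (contr.treatment for unordered factors). The variable names are only illustrative.

```r
a <- factor(c("a", "b", "c"))

# With an intercept, the factor is contrast coded in n-1 columns.
with_intercept <- model.matrix(~ a)

# Without an intercept, the factor is coded in n columns.
no_intercept <- model.matrix(~ 0 + a)

colnames(with_intercept)  # "(Intercept)" "ab" "ac"
colnames(no_intercept)    # "aa" "ab" "ac"
```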
But in fact, when R decides it wants dummy coding, it doesn't call the
contrast function, it just dummy codes unconditionally:
> a <- factor(c("a", "b", "c"))
> invisible(model.matrix(~ a)) # contrast coded
trace: ctrfn(levels(x), contrasts = contrasts)
> invisible(model.matrix(~ 0 + a)) # dummy coded
In fact, I can't find any code anywhere in R that ever uses contrasts=FALSE.
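Another way to check this without tracing is to use a factor whose contrast function would return something other than the identity for contrasts=FALSE. The sketch below assumes the default options("contrasts") setting, under which ordered factors use contr.poly; the variable names are mine.

```r
# An ordered factor is coded with contr.poly by default.
o <- factor(c("a", "b", "c"), ordered = TRUE)

# No intercept, so the factor gets n columns.
mm <- model.matrix(~ 0 + o)

# What contr.poly would supply if asked for a full-rank coding.
full_poly <- contr.poly(3, contrasts = FALSE)

# Compare the numbers only, ignoring dimnames and attributes.
is_identity  <- all(as.vector(mm) == as.vector(diag(3)))
is_full_poly <- isTRUE(all.equal(as.vector(mm), as.vector(full_poly)))

is_identity   # TRUE
is_full_poly  # FALSE
```

The n-column coding comes out as plain indicator columns, not as the matrix contr.poly returns for contrasts=FALSE.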
So what's going on? Is this a bug, and *should* R be passing
contrasts=FALSE to the contrast function when it "dummy codes" factors?