[Rd] logical variables in models
    Fox, John 
    jfox at mcmaster.ca
       
    Wed Dec 19 16:19:27 CET 2018
    
    
  
Dear R-devel list members,
This is an observation about how logical variables in models are handled, followed by questions.
As a general matter, character variables and logical variables are treated as if they were factors when they appear on the RHS of a model formula; for example:
- - - - snip- - - - -
> set.seed(123)
> c <- sample(letters[1:3], 10, replace=TRUE)
> f <- as.factor(sample(LETTERS[1:3], 10, replace=TRUE))
> L <- sample(c(TRUE, FALSE), 10, replace=TRUE)
> y <- rnorm(10)
> options(contrasts=c("contr.sum", "contr.poly"))
> mod <- lm(y ~ c + f + L)
> model.matrix(mod)
   (Intercept) c1 c2 f1 f2 L1
1            1  1  0 -1 -1  1
2            1 -1 -1  0  1  1
3            1  0  1 -1 -1  1
4            1 -1 -1  0  1  1
5            1 -1 -1  1  0  1
6            1  1  0 -1 -1  1
7            1  0  1  1  0  1
8            1 -1 -1  1  0  1
9            1  0  1  1  0 -1
10           1  0  1 -1 -1 -1
attr(,"assign")
[1] 0 1 1 2 2 3
attr(,"contrasts")
attr(,"contrasts")$c
[1] "contr.sum"
attr(,"contrasts")$f
[1] "contr.sum"
attr(,"contrasts")$L
[1] "contr.sum"
- - - - snip- - - - -
But logical variables don’t appear in the $xlevels component of the objects created by lm() and similar functions:
- - - - snip- - - - -
> mod$xlevels
$c
[1] "a" "b" "c"
$f
[1] "A" "B" "C"
- - - - snip- - - - -
Why the discrepancy? It’s true that the level set (i.e., TRUE, FALSE) for a logical “factor” is known in advance, but examining the $xlevels component is a simple way to detect variables treated as factors in the model. For example, I’d argue that .getXlevels() returns misleading information, because it omits L:
- - - - snip- - - - -
> .getXlevels(terms(mod), model.frame(mod))
$c
[1] "a" "b" "c"
$f
[1] "A" "B" "C"
- - - - snip- - - - -
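To make the suggestion concrete, here is a minimal sketch of what a consistent result might look like. The helper xlevels_with_logicals() is hypothetical (it is not part of R), and the small data set is made up for illustration. It assumes that a logical predictor should always be reported with the levels "FALSE" and "TRUE", in that order.

```r
# Hypothetical helper, not part of R: like .getXlevels(), but it also
# reports levels for logical predictors in the model frame.
xlevels_with_logicals <- function(Terms, m) {
  xlev <- .getXlevels(Terms, m)        # factors and character variables
  if (is.null(xlev)) xlev <- list()
  # Predictor columns of the model frame (drop the response, if any)
  xvars <- names(m)
  resp <- attr(Terms, "response")
  if (resp > 0) xvars <- xvars[-resp]
  # Assumption: a logical predictor has the fixed level set FALSE, TRUE
  is_lgl <- vapply(m[xvars], is.logical, NA)
  for (v in xvars[is_lgl]) xlev[[v]] <- c("FALSE", "TRUE")
  xlev
}

# Made-up data for illustration only
d <- data.frame(
  y  = c(1.2, 0.4, 2.1, 1.9, 0.3, 1.1, 0.8, 1.5),
  ch = c("a", "b", "c", "a", "b", "c", "a", "b"),
  f  = factor(c("A", "B", "A", "C", "B", "C", "C", "A")),
  L  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
)
mod2 <- lm(y ~ ch + f + L, data = d)
xl <- xlevels_with_logicals(terms(mod2), model.frame(mod2))
names(xl)
```

With this sketch the logical variable appears alongside the factor and the character variable, which is the behaviour I am asking about.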
An alternative for detecting “factors” is to examine the 'contrasts' attribute of the model matrix, although that doesn’t produce levels:
- - - - snip- - - - -
> names(attr(model.matrix(mod), "contrasts"))
[1] "c" "f" "L"
- - - - snip- - - - -
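The levels can be recovered by combining the 'contrasts' attribute with the model frame. The following is only a sketch: factor_levels_from_contrasts() is a hypothetical name, the data are made up, and the levels for a logical variable are taken from the observed values, so a logical that is constant in the data would show a single level.

```r
# Hypothetical helper, not part of R: find the variables that
# model.matrix() treated as factors and look up their levels in the
# model frame.
factor_levels_from_contrasts <- function(model) {
  factor_like <- names(attr(model.matrix(model), "contrasts"))
  mf <- model.frame(model)
  # as.factor() covers factors, character variables, and logicals alike
  lapply(mf[factor_like], function(x) levels(as.factor(x)))
}

# Made-up data for illustration only
d <- data.frame(
  y  = c(1.2, 0.4, 2.1, 1.9, 0.3, 1.1, 0.8, 1.5),
  ch = c("a", "b", "c", "a", "b", "c", "a", "b"),
  f  = factor(c("A", "B", "A", "C", "B", "C", "C", "A")),
  L  = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE)
)
mod3 <- lm(y ~ ch + f + L, data = d)
lv <- factor_levels_from_contrasts(mod3)
str(lv)
```

This works, but it requires building the model matrix just to discover which variables are factor-like, which is more roundabout than reading a component of the fitted object.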
Is there an argument against making the treatment of logical variables consistent with that of factors and character variables? Comments?
Best,
 John
  -------------------------------------------------
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox
    
    