[R] Centering multi-level unordered factors

David Winsemius dwinsemius at comcast.net
Tue Oct 8 03:53:40 CEST 2013


On Oct 7, 2013, at 4:52 PM, Robert Lynch wrote:

> I have a question I am not even sure quite how to ask.
> 
> When R fits models with unordered categorical variables as predictors
> (RHS of the model), it automatically converts them into one fewer
> dichotomous variables than there are levels.
> 
> For example, if I had levels(trait) = ("A","B","C"), it would automatically
> recode to
>     NewVar1  NewVar2
> A         0        0
> B         1        0
> C         0        1
> 
> What I would like to know is whether there is a way to "center" these
> categorical variables, and if so, how.
> 
> For continuous variables it is simple:
> x <- x - mean(x)

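You can choose different contrasts. Take a look at contr.sum(): with
sum-to-zero contrasts the intercept refers to the unweighted average over
the levels instead of to the reference level.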

> trait <- factor(1:3, labels = c("A","B","C"))

> contrasts(trait) <- contr.sum(3)
> model.matrix( ~trait )
  (Intercept) trait1 trait2
1           1      1      0
2           1      0      1
3           1     -1     -1
attr(,"assign")
[1] 0 1 1
attr(,"contrasts")
attr(,"contrasts")$trait
  [,1] [,2]
A    1    0
B    0    1
C   -1   -1
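
To see what these contrasts do to the coefficients, here is a minimal sketch
on made-up, deliberately unbalanced data (the response y and the group sizes
are assumptions for illustration, not from the original question). In a model
with this single factor, the intercept should equal the unweighted mean of
the per-level means, and trait1, trait2 the deviations of A and B from it.

```r
# Made-up, deliberately unbalanced data (not from the original post)
set.seed(1)
trait <- factor(rep(c("A", "B", "C"), times = c(4, 6, 10)))
y <- c(A = 1, B = 3, C = 8)[as.character(trait)] + rnorm(length(trait))

contrasts(trait) <- contr.sum(nlevels(trait))
fit <- lm(y ~ trait)

level_means <- tapply(y, trait, mean)   # observed mean of y within each level
grand <- mean(level_means)              # unweighted mean of the level means

# Intercept equals the unweighted mean of the level means
intercept_ok <- isTRUE(all.equal(as.vector(coef(fit)[1]), grand))

# trait1 and trait2 are the deviations of A and B from that mean
dev_ok <- isTRUE(all.equal(as.vector(coef(fit)[2:3]),
                           as.vector(level_means[1:2] - grand)))

c(intercept_ok, dev_ok)
```

The deviation for the last level, C, is not printed as a coefficient; it is
minus the sum of the other two, -(trait1 + trait2).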

-- 
David.

> 
> For a single dichotomous variable it is not so hard:
> gender <- gender - sum(gender)/length(gender)
> where gender is coded (0,1) or (-.5,.5), for example.
> This gives a gender coefficient in a model that still reflects
> the difference between the two genders, but the intercept and the other
> coefficients would be for someone of "average gender".
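
The same subtraction can be applied to the dummy columns of a multi-level
factor. The sketch below is only an illustration on made-up data (x, y, D and
Dc are hypothetical names): it centers the treatment-coded columns, which puts
the intercept at the sample-weighted average of the levels, whereas
contr.sum() weights every level equally. In an additive model like this one
only the intercept moves; the slopes are unchanged.

```r
# Made-up data: one covariate x and a 3-level factor with unequal group sizes
set.seed(2)
trait <- factor(rep(c("A", "B", "C"), times = c(4, 6, 10)))
x <- rnorm(length(trait))
y <- 2 + 0.5 * x + c(A = 0, B = 1, C = 3)[as.character(trait)] +
  rnorm(length(trait))

# Treatment-coded dummy columns (intercept column dropped)
D <- model.matrix(~ trait,
                  contrasts.arg = list(trait = "contr.treatment"))[, -1, drop = FALSE]
# Subtract each column's mean, exactly like x - mean(x)
Dc <- scale(D, center = TRUE, scale = FALSE)

fit_raw <- lm(y ~ x + D)    # uncentered dummies
fit_ctr <- lm(y ~ x + Dc)   # centered dummies

# Slopes for x and for the dummies are identical in both fits
slopes_same <- isTRUE(all.equal(as.vector(coef(fit_ctr)[-1]),
                                as.vector(coef(fit_raw)[-1])))

# The intercept shifts by the dummy slopes weighted by the level proportions
shift <- sum(colMeans(D) * coef(fit_raw)[3:4])
intercept_shifted <- isTRUE(all.equal(as.vector(coef(fit_ctr)[1]),
                                      as.vector(coef(fit_raw)[1]) + shift))

c(slopes_same, intercept_shifted)
```

If the model contains interactions with the factor, the choice of coding does
change what the other main-effect coefficients mean, which is where the choice
between weighted and unweighted centering matters.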
> 
> It is that last part that I am unclear on for a multi-level (3 or more
> levels) factor. How do you set up variables so that the *other* coefficients
> reflect the average across the factor levels? Do I need two or three
> centered variables? And is there a quick way to get all those variables
> if my factor has many levels, e.g. 14?
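
For a factor with many levels nothing has to be built by hand. A small sketch,
assuming a made-up 14-level factor f (the name and data are hypothetical):
passing nlevels(f) to contr.sum() yields one fewer column than there are
levels, and model.matrix() then expands them automatically.

```r
# Made-up 14-level factor; levels are set explicitly so all 14 are present
set.seed(3)
f <- factor(sample(letters[1:14], 200, replace = TRUE), levels = letters[1:14])

# Sum-to-zero contrasts sized from the factor itself
contrasts(f) <- contr.sum(nlevels(f))

dim(contrasts(f))        # 14 rows (levels) by 13 columns (contrast variables)
colSums(contrasts(f))    # every column sums to zero across the levels

# The design matrix has the intercept plus the 13 contrast columns
X <- model.matrix(~ f)
ncol(X)
```

The contrasts can also be requested per model instead of being stored on the
factor, for example lm(y ~ f, contrasts = list(f = "contr.sum")).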
> 
> 
> Robert

David Winsemius
Alameda, CA, USA


