[R] how to apply the dummy coding rule in a dataframe with complete factor levels to another dataframe with incomplete factor levels?

Kingsford Jones kingsfordjones at gmail.com
Sat Jun 20 18:34:39 CEST 2009


Hi Sean,

The levels attribute of a factor can contain levels that are not
represented in the data, and model.matrix creates a column for every
non-baseline level whether or not it occurs.  So, in your example we
can get the desired result by supplying the full set of levels via the
levels argument to the factor function:

> dfB <- data.frame(f1 = factor(c('a','b','b'), levels = c('a','b','c')),
+                   f2 = factor(c('aa','bb','bb'), levels = c('aa','bb','cc')))
> model.matrix(~f1+f2, data=dfB)
  (Intercept) f1b f1c f2bb f2cc
1           1   0   0    0    0
2           1   1   0    1    0
3           1   1   0    1    0
attr(,"assign")
[1] 0 1 1 2 2
attr(,"contrasts")
attr(,"contrasts")$f1
[1] "contr.treatment"

attr(,"contrasts")$f2
[1] "contr.treatment"



hth,
Kingsford Jones



On Fri, Jun 19, 2009 at 10:13 PM, Sean Zhang<seanecon at gmail.com> wrote:
> Dear R helpers:
>
> Sorry to bother you with a basic question about model.matrix.
> Basically, I want to apply the dummy coding rule in a dataframe with
> complete factor levels to another dataframe with incomplete factor levels.
> I used model.matrix, but could not get what I want.
> The following is an example.
>
> #Suppose I have two data frames, dfA and dfB
> dfA=data.frame(f1=factor(c('a','b','c')), f2=factor(c('aa','bb','cc')))
> dfB =data.frame(f1=factor(c('a','b','b')), f2=factor(c('aa','bb','bb')))
> #dfB's factor variables have fewer levels
>
> #use model.matrix on dfA
> (matA<-model.matrix(~f1+f2,data=dfA))
> #use model.matrix on dfB
> (matB<-model.matrix(~f1+f2,data=dfB))
> #I would actually like to dummy code dfB using the dummy coding rule defined in
> #model.matrix(~f1+f2,data=dfA)
> #matB_wanted is below
> (matB_wanted<-rbind(c(1,0,0,0,0),c(1,1,0,1,0),c(1,1,0,1,0)) )
> colnames(matB_wanted)<-colnames(matA)
> matB_wanted
> Can someone kindly show me how to get matB_wanted?
> Many thanks in advance!
>
> -Sean