[R] multi-column factor

Sam Steingold sds at gnu.org
Sun Sep 16 18:46:02 CEST 2012


I have a data frame with columns which draw on the same underlying
universe, so I want them to be factors with the same level set:

--8<---------------cut here---------------start------------->8---
> z <- data.frame(a=c("a","b","c"),b=c("b","c","d"),stringsAsFactors=FALSE)
> str(z)
'data.frame':	3 obs. of  2 variables:
 $ a: chr  "a" "b" "c"
 $ b: chr  "b" "c" "d"
> z$a <- factor(z$a,levels=union(z$a,z$b))
> z$b <- factor(z$b,levels=union(z$a,z$b))
> str(z)
'data.frame':	3 obs. of  2 variables:
 $ a: Factor w/ 4 levels "a","b","c","d": 1 2 3
 $ b: Factor w/ 4 levels "a","b","c","d": 2 3 4
--8<---------------cut here---------------end--------------->8---
Is factor(z$a,levels=union(z$a,z$b)) the right way to handle this?
Maybe there is a better way to extract the levels than union()?
(Bear in mind that I have ~10M rows and ~1M levels, so performance is an
issue.)
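
One possible variant, as a minimal sketch only: build the level set once
and apply it to both columns, instead of calling union() once per column.
The names cols and lev are just illustrative, and whether this is
noticeably faster at ~10M rows is an untested assumption.

```r
# Same toy data as in the transcript above.
z <- data.frame(a = c("a", "b", "c"), b = c("b", "c", "d"),
                stringsAsFactors = FALSE)

# Columns that draw on the same universe (hypothetical helper variable).
cols <- c("a", "b")

# Compute the shared level set a single time; for character columns this
# gives the same values, in the same order, as union(z$a, z$b).
lev <- unique(unlist(z[cols], use.names = FALSE))

# Convert every listed column to a factor with the shared levels.
z[cols] <- lapply(z[cols], factor, levels = lev)

str(z)
```

The levels are computed from the character columns before any of them is
converted, so the result does not depend on the order of the assignments.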

Thanks!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://iris.org.il http://honestreporting.com
http://camera.org http://www.memritv.org http://jihadwatch.org
When you talk to God, it's prayer; when He talks to you, it's schizophrenia.


