[Rd] Why is there no c.factor?

Matthew Dowle mdowle at mdowle.plus.com
Thu Feb 4 19:42:33 CET 2010


A search for "c.factor" returns tons of hits on this topic.

Here's just one of the hits, from 2006, when I asked the same question:
http://tolstoy.newcastle.edu.au/R/e2/devel/06/11/1137.html

So it appears to be complicated, and there are good reasons for its absence.
Since I needed it, I created c.factor in the data.table package, below. It is
more efficient because it doesn't convert each factor to character (which
would lose some of the benefit of using factors). I've been told I'm not
unique in this approach and that other packages also have their own c.factor.
It deliberately isn't exported. It's worked well for me over the years anyway.

c.factor <- function(...)
{
    args <- list(...)
    # Coerce any non-factor arguments to factor.
    for (i in seq_along(args)) if (!is.factor(args[[i]])) args[[i]] <- as.factor(args[[i]])
    # The first must be a factor, otherwise we wouldn't be inside c.factor;
    # it's checked anyway in the line above.
    newlevels <- sort(unique(unlist(lapply(args, levels))))
    # Remap each factor's integer codes onto the combined levels, without
    # converting to character.
    ans <- unlist(lapply(args, function(x) {
        m <- match(levels(x), newlevels)
        m[as.integer(x)]
    }))
    levels(ans) <- newlevels
    class(ans) <- "factor"
    ans
}
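
As a quick check of the remapping idea, here is a minimal, self-contained sketch. The helper name combine_factors and the sample factors f1 and f2 are hypothetical, chosen only for illustration; the function follows the same steps as c.factor above and is compared with the character-conversion approach.

```r
# Hypothetical helper: same integer-code remapping as c.factor above,
# under a different name so that no c() method is overridden.
combine_factors <- function(...) {
    args <- lapply(list(...), as.factor)
    newlevels <- sort(unique(unlist(lapply(args, levels))))
    codes <- unlist(lapply(args, function(x) {
        m <- match(levels(x), newlevels)  # old level position -> new position
        m[as.integer(x)]                  # translate each element's code
    }))
    structure(codes, levels = newlevels, class = "factor")
}

f1 <- factor(c("b", "a"))   # levels: a b
f2 <- factor(c("c", "b"))   # levels: b c

res <- combine_factors(f1, f2)
print(res)
# [1] b a c b
# Levels: a b c

# Reference result obtained by converting to character first.
ref <- factor(c(as.character(f1), as.character(f2)), levels = c("a", "b", "c"))
print(identical(res, ref))
# [1] TRUE
```

Note that the combined levels are sorted here, whereas the version quoted below keeps levels in order of first appearance, so the two can differ in level order for the same inputs.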

"Hadley Wickham" <hadley at rice.edu> wrote in message 
news:f8e6ff051002040753x33282f33l78fce9f98dc29ae8 at mail.gmail.com...
> Hi all,
>
> Is there a reason that there is no c.factor method?  Analogous to
> c.Date, I'd expect something like the following to be useful:
>
> c.factor <- function(...) {
>  factors <- list(...)
>  levels <- unique(unlist(lapply(factors, levels)))
>  char <- unlist(lapply(factors, as.character))
>
>  factor(char, levels = levels)
> }
>
> c(factor("a"), factor("b"), factor(c("c", "b","a")), factor("d"))
> # [1] a b c b a d
> # Levels: a b c d
>
> Hadley
>
> -- 
> http://had.co.nz/
>


