[R] What is behind class coercion of a factor into a character
rolf.turner at xtra.co.nz
Mon Oct 22 21:28:34 CEST 2012
WARNING: Use with caution!
There is a way to effect the catenation of factors: The data.frame
method for rbind() does this. E.g.
f1 <- factor(sample(letters[1:3],42,TRUE))
f2 <- factor(sample(letters[1:4],66,TRUE))
d1 <- data.frame(f=f1)
d2 <- data.frame(f=f2)
dd <- rbind(d1,d2)
ff <- dd[,1]
et voila, ff is the "desired" catenation of f1 and f2.
But heed Bert's words of caution below!
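For anyone who wants to check what rbind() actually produced, here is a minimal sketch. It uses small fixed factors in place of the random ones above (the values are illustrative, not from the thread). The second route, via as.character(), is an assumed alternative and not something proposed in this discussion.

```r
# Illustrative fixed factors (hypothetical values, chosen for reproducibility)
f1 <- factor(c("a", "b", "c", "a"))
f2 <- factor(c("b", "d", "a"))

# Route 1: the data.frame method for rbind() merges the two level sets
dd <- rbind(data.frame(f = f1), data.frame(f = f2))
ff <- dd[, 1]

# Route 2 (assumed alternative): convert to character, then re-factor
gg <- factor(c(as.character(f1), as.character(f2)))

# Both routes should keep the labels, in order, rather than the integer codes
print(levels(ff))
print(identical(as.character(ff), as.character(gg)))
```

Both routes silently treat equal labels in f1 and f2 as the same level, which is exactly the assumption Bert warns about.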
On 23/10/12 02:58, Bert Gunter wrote:
> There was a recent discussion on this list about this (Sam Steingold
> was the OP IIRC).
> The issue is ?c . In particular:
> "c is sometimes used for its side effect of removing attributes except
> names, for example to turn an array into a vector."
> Hence, the factor attribute is removed and you get what you saw. As
> regards its "rationale," you may find Bill Dunlap's comments on
> "c()'s unfortunate history" relevant. The problem with factors is
> "what should concatenation do, anyway?" If a <- factor(c("x", "y"))
> and b <- factor(c("y", "z")), what should c(a,b) be? -- There is no
> reason to assume that the "y" in a is the same as the "y" in b!
> On Mon, Oct 22, 2012 at 6:46 AM, Tal Galili <tal.galili at gmail.com> wrote:
>> Hello all,
>> Please review the following simple code:
>> # make a factor:
>> x <- factor(c("one", "two"))
>> # what should be the output to the following expression?
>> c(x, "3") # <=== ????
>> # I expected it to be as the output of:
>> c(as.character(x), "3")
>> # But in fact, the output is what would happen if we had run the next line:
>> c(as.character(as.numeric(x)), "3")
>> # p.s: c(x, 3) would of course behave differently...
>> I imagine the above behavior is a "feature" (not a bug), but I am curious
>> as to what is the rationale behind it. Is it because of computational
>> efficiency, or something that fixes some case study?
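To make the distinction in the original question concrete, here is a small sketch that looks at the integer codes and the labels separately, without relying on what c() itself does to a factor. The variable names (codes, labs, code_in_a, code_in_b) are only illustrative.

```r
x <- factor(c("one", "two"))

# A factor stores integer codes plus a "levels" attribute holding the labels
codes <- as.integer(x)    # 1 2
labs  <- as.character(x)  # "one" "two"

# Dropping the attributes leaves only the codes, which is the result reported
print(c(as.character(codes), "3"))  # "1" "2" "3"

# Converting to labels first gives what the poster expected
print(c(labs, "3"))                 # "one" "two" "3"

# Bert's example: the same label can carry different codes in two factors
a <- factor(c("x", "y"))
b <- factor(c("y", "z"))
code_in_a <- as.integer(a)[as.character(a) == "y"]  # 2
code_in_b <- as.integer(b)[as.character(b) == "y"]  # 1
print(c(code_in_a, code_in_b))
```

Since "y" is code 2 in a but code 1 in b, simply gluing the codes together would mix up the labels. This is one way to see why concatenating factors needs an explicit decision about the levels.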