[R] Using merge can convert character variables to factor

ripley at stats.ox.ac.uk
Thu May 16 21:56:50 CEST 2002


On Thu, 16 May 2002, David Kane wrote:

> I am not sure if this is a bug or a feature, but I could not find it
> documented. In certain circumstances, using merge can convert a character
> variable to factor. Consider a simple example:
>
> > x <- data.frame(a = 1:4)
> > y <- data.frame(b = LETTERS[1:3])
> > z <- merge(x, y, by = 0)
> > unlist(lapply(z, data.class))
> Row.names         a         b
>  "factor" "integer"  "factor"
>
> So far, so good. b should be a factor since it is a factor in y.
>
> > is.factor(y$b)
> [1] TRUE
>
> Changing b to be a character variable works as well.
>
> > y$b <- as.character(y$b)
> > z <- merge(x, y, by = 0)
> > unlist(lapply(z, data.class))
>   Row.names           a           b
>    "factor"   "integer" "character"
>
> But when we change the merge to include all of the x rows in the resulting
> dataframe, we get:
>
> > z <- merge(x, y, by = 0, all.x = TRUE)
> > unlist(lapply(z, data.class))
> Row.names         a         b
>  "factor" "integer"  "factor"

sapply would be better, BTW.
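
For illustration, a minimal sketch of the sapply form, rebuilding the data from the quoted example. The stringsAsFactors = FALSE argument is an assumption added here so that b starts out as character regardless of the R version in use; the exact class names printed depend on that version, so no output is shown.

```r
# Rebuild the example data; stringsAsFactors = FALSE is an added assumption
# so that b is a character column in y from the start.
x <- data.frame(a = 1:4)
y <- data.frame(b = LETTERS[1:3], stringsAsFactors = FALSE)
z <- merge(x, y, by = 0)

# sapply() simplifies the per-column results to a named character vector,
# so the explicit unlist(lapply(...)) is not needed.
classes <- sapply(z, data.class)
print(classes)
```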

> I think that b should still be a character variable in this case.

Well, b is character in x and factor in y. Your example is inconsistent:
perhaps it should complain at you?
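
If the aim is simply to end up with a character column whatever merge does, one possible workaround (a sketch, not something proposed in this thread) is to check the column after merging and convert it back when needed. Whether the conversion is triggered at all depends on the R version; stringsAsFactors = FALSE is again an added assumption.

```r
# Same data as in the quoted example, with b created as character.
x <- data.frame(a = 1:4)
y <- data.frame(b = LETTERS[1:3], stringsAsFactors = FALSE)

# Keep every row of x; the unmatched row gets NA in b.
z <- merge(x, y, by = 0, all.x = TRUE)

# Convert b back to character only if merge turned it into a factor.
if (is.factor(z$b)) {
  z$b <- as.character(z$b)
}

print(sapply(z, data.class))
```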

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

