[R] Using merge can convert character variables to factor

David Kane <a296180 at mica.fmr.com>
Thu May 16 21:24:43 CEST 2002

I am not sure if this is a bug or a feature, but I could not find it
documented. In certain circumstances, using merge can convert a character
variable to factor. Consider a simple example:

> x <- data.frame(a = 1:4)
> y <- data.frame(b = LETTERS[1:3])
> z <- merge(x, y, by = 0)
> unlist(lapply(z, data.class))
Row.names         a         b 
 "factor" "integer"  "factor" 

So far, so good. b should be a factor since it is a factor in y.

> is.factor(y$b)
[1] TRUE

Changing b to be a character variable works as well.

> y$b <- as.character(y$b)
> z <- merge(x, y, by = 0)
> unlist(lapply(z, data.class))
  Row.names           a           b 
   "factor"   "integer" "character" 

But when we change the merge to include all of the x rows in the resulting
dataframe, we get:

> z <- merge(x, y, by = 0, all.x = TRUE)
> unlist(lapply(z, data.class))
Row.names         a         b 
 "factor" "integer"  "factor" 

I think that b should still be a character variable in this case.
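
In the meantime, one possible workaround is to convert the affected columns
back after the merge. The sketch below is only an illustration:
restore_character is a hypothetical helper name, not part of R, and it assumes
that the character columns keep the same names in the merged result (that is,
merge did not add suffixes to them).

```r
# Hypothetical helper (not part of base R): convert back to character any
# column of `merged` that was a character column in `original`.
restore_character <- function(merged, original) {
  # Names of the columns that are character in the original data frame
  chr_cols <- names(original)[vapply(original, is.character, logical(1))]
  # Only touch columns that still exist under the same name after the merge
  for (nm in intersect(chr_cols, names(merged))) {
    merged[[nm]] <- as.character(merged[[nm]])
  }
  merged
}

x <- data.frame(a = 1:4)
y <- data.frame(b = LETTERS[1:3])
y$b <- as.character(y$b)

# Merge on row names, keeping all rows of x, then restore the class of b
z <- restore_character(merge(x, y, by = 0, all.x = TRUE), y)
is.character(z$b)
```

The row of x that has no match in y still gets NA in b; as.character keeps
that as a missing value rather than turning it into the string "NA".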

If this is a bug, please let me know and I would be happy to submit it.

> R.version
platform sparc-sun-solaris2.6
arch     sparc               
os       solaris2.6          
system   sparc, solaris2.6   
major    1                   
minor    5.0                 
year     2002                
month    04                  
day      29                  
language R                   


Dave Kane