[R] "too large for hashing"

Adam D. I. Kramer adik-rhelp at ilovebacon.org
Thu Apr 5 20:03:00 CEST 2012


Hello,

I'm doing some analysis on a rather large data set. In this case,
some simple commands are failing. For example, this one:

> x$eventtype <- factor(x$eventtype)
Error in unique.default(x) : length 1093574297 is too large for hashing

I think this is a bug, because "hashing" the whole column should not be
required by the "factor" function. Am I right? Only the unique keys need to
be hashed, not every element. Sure, there is the potential to overflow the
hash table if there are too many distinct keys, but this error should be
thrown only if that actually occurs, no?
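
For what it's worth, the reported length (1093574297) is just above
2^30 = 1073741824, which looks like where the limit sits. If the check
applies per call to unique() or match(), one possible workaround is to build
the factor in chunks. Below is a minimal sketch under that assumption;
chunked_factor and chunk_size are hypothetical names, not part of R, and I
have not run this on a vector of the size above.

```r
# Hypothetical helper: build a factor chunk by chunk, so that no single
# call to unique() or match() sees more than chunk_size elements.
chunked_factor <- function(v, chunk_size = 1e8) {
  n <- length(v)
  if (n == 0) return(factor(v))
  starts <- seq(1, n, by = chunk_size)

  # Pass 1: collect the distinct values, one chunk at a time.
  lev <- NULL
  for (s in starts) {
    e <- min(s + chunk_size - 1, n)
    lev <- unique(c(lev, unique(v[s:e])))
  }
  lev <- sort(lev)  # drops NA and sorts, as factor() does by default

  # Pass 2: translate each chunk into integer codes against the levels.
  codes <- integer(n)
  for (s in starts) {
    e <- min(s + chunk_size - 1, n)
    codes[s:e] <- match(v[s:e], lev)
  }

  # Assemble the factor from the integer codes and the level labels.
  structure(codes, levels = as.character(lev), class = "factor")
}
```

With that in place, the failing line would become
x$eventtype <- chunked_factor(x$eventtype). This assumes the number of
distinct event types is small compared to the column length, so the level
set stays tiny while the column is scanned twice.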

Cordially,
Adam D. I. Kramer, Ph.D.
Data Scientist, Facebook, Inc.
akramer at fb.com


