[R] / Operator not meaningful for factors

Don MacQueen macq at llnl.gov
Tue May 4 08:14:12 CEST 2010

At 3:50 PM -0700 5/3/10, John Kane wrote:
>  I think that you are correct.  R has the annoying habit of 
>converting character data to factors when you don't want it to while 
>it is importing data.  This is because the option 
>"stringsAsFactors" is set to TRUE for some weird historical reasons.

Well, "annoying" is in the eye of the beholder. The reason is not 
weird at all; the original S language, upon which R is based, was 
designed first for statistical analysis. When the language was 
expanded to include advanced modeling capabilities (linear models, 
generalized linear models, and more) it became apparent that factors 
are the appropriate form for using categorical data in such models. 
It is still the "R Project for Statistical Computing" (see the R home 
page), so the default is unchanged.

Hence, when users get factors when they were expecting numbers, it's 
virtually always because they have some non-numeric character strings 
mixed in with the data. R then defaults to interpreting it as 
categorical data, represented as a factor.
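
As a rough illustration of the point above, here is a small sketch. The 
CSV text, the "n/a" marker and all variable names are made up for the 
example, and stringsAsFactors is passed explicitly so that the result 
does not depend on whatever default your version of R uses.

```r
# Hypothetical data: one stray non-numeric entry ("n/a") in a numeric column
csv_text <- "id,value
1,10
2,20
3,n/a
4,40"

# Set stringsAsFactors explicitly rather than relying on the default
dat <- read.csv(text = csv_text, stringsAsFactors = TRUE)

# The stray string makes the whole column a factor, not numbers
is.factor(dat$value)

# Arithmetic on a factor yields NAs, with the warning from the subject line
# (suppressed here so the script runs quietly)
halved <- suppressWarnings(dat$value / 2)

# One way to recover the numbers: go through character first.
# The non-numeric entry becomes NA (with a coercion warning, suppressed here).
value_num <- suppressWarnings(as.numeric(as.character(dat$value)))

# Alternatively, declare the missing-value marker when importing
dat2 <- read.csv(text = csv_text, na.strings = "n/a")
is.numeric(dat2$value)
```

Which of the two fixes is appropriate depends on the data. If the stray 
strings really are missing-value codes, telling read.csv() about them via 
na.strings is usually the cleaner route; otherwise it is worth looking at 
levels(dat$value) first to see what is actually in the column.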

Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
macq at llnl.gov
