[Rd] Inconsistency in as.data.frame.table for stringsAsFactors

Peter Dalgaard p.dalgaard at biostat.ku.dk
Sat Jan 23 12:12:54 CET 2010


Stavros Macrakis wrote:
> Martin,
> 
> I agree that global options settings that affect computations are
> problematic.
> 
> But that's not the issue I was addressing.  If for some classes, func.CLASS
> has certain defaults for some arguments, it is surprising that for other
> classes, it has different defaults, whether these defaults are fixed or
> taken from global settings -- when there is no obvious reason for the
> default to vary by class.
> 
>           -s

"A foolish consistency is the hobgoblin of little minds..."

The thing is that if you are converting the classifying factors of a 
table to columns of a data frame, you will presumably prefer that they 
come out as factors, retaining level order. The alternative is like this:

 >  (x <- as.table(c("Rare"=5, "Medium"=2, "Well-done"=6)))
      Rare    Medium Well-done
         5         2         6
 > df <- as.data.frame(x, stringsAsFactors=FALSE)
 > xtabs(Freq~Var1, data=df)
Var1
    Medium      Rare Well-done
         2         5         6
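
For contrast, here is a minimal sketch of the default behaviour, assuming a stock R session with base R only. The names df_default and df_char are made up for illustration. It shows the level order surviving the default conversion, and one way to restore it by hand if the column has already become character:

```r
# Same one-way table as in the transcript above.
x <- as.table(c("Rare" = 5, "Medium" = 2, "Well-done" = 6))

# Default for tables: classifying variables come out as factors,
# with levels in the order of the table's dimnames.
df_default <- as.data.frame(x)
levels(df_default$Var1)
# [1] "Rare"      "Medium"    "Well-done"

# With character columns, xtabs() builds a factor itself and sorts the
# levels alphabetically. Supplying the levels explicitly restores the order.
df_char <- as.data.frame(x, stringsAsFactors = FALSE)
df_char$Var1 <- factor(df_char$Var1, levels = names(x))
tab <- xtabs(Freq ~ Var1, data = df_char)
names(tab)
```

The explicit factor() call is only a workaround; the point stands that the factor default makes it unnecessary.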

Converting a table is completely different from the other cases, where 
as.data.frame will auto-convert character variables to factors, e.g., on 
reading. Having a global option intended for read.table() interfere with 
the above kind of operation could be a really nasty surprise for the 
user. (Notice also that the option was introduced in 2.10.0; before 
then, no one would expect that classifying factors could come out as 
non-factors. Defaulting to the global option could easily break working 
code.)

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



More information about the R-devel mailing list