[Rd] Inconsistency in as.data.frame.table for stringsAsFactors

Terry Therneau therneau at mayo.edu
Mon Jan 25 16:36:19 CET 2010


Kudos to Peter for actually answering the question of why the
inconsistency was there.  It might be well to add a bit to the
documentation.

  As to the larger discussion of global defaults, let me offer two
opinions:
  1. They are the salvation of those of us who do not agree with certain
global defaults.
   -- 'best practice' is not always a consensus
   -- defaults are often informed too much by "the data we happened to
be analysing when we decided the default".  The long-standing
contr.helmert one for instance; a look at the white book shows that
they were working on an orthogonal manufacturing design, the one case
where Helmert contrasts make sense.  The survival package contains
several defaults with the same type of origin.

  2. People in these discussions play the "it might break something"
card far too often.  At Mayo, for instance, the table() command has been
replaced by one which lists NA by default, for all data types.  We've
done this for as long as R and Splus have been used (10+ years), for all
150 people in the biostat group, and nothing has broken yet.  A
suggestion to allow this as a global default will immediately elicit the
above argument, I guarantee it.  Ditto for our experience with
stringsAsFactors=FALSE; nothing's broken yet.  Give a concrete example
before crying wolf.

Terry T


