[Rd] "+" for character method...

Bill Dunlap bill at insightful.com
Sat Aug 26 00:52:27 CEST 2006


> >     >> There have been propositions to make "+" work in S (and
> >     >> R) like in some other languages, namely for character
> >     >> (vectors),
> >     >>
> >     >> a + b := paste(a,b, sep="")
> > ...
> > yes.  I think however if we keep speed and clarity and catching
> > user errors all in mind, it would be enough - and better - to
> > only dispatch to paste(.,.) when both arguments are character
> > (vectors), i.e., the above case needed
> >  "a" + as.character(1:7) or "a" + paste(1:7) or "a" + format(1:7)
> >  which after all is really clearer, even more so for cases of
> >  "1" + 2  which I'd rather want keeping to give errors.
> >
> > If  Char + Num  should work like above, then also
> >     Num + Char  should (since after all,  "+" should be commutative
> > 			apart from floating point precision issues).
> >
> > and so the internal C code gets a bit more complicated and slightly
> > slower..  something we had in mind we should strongly avoid...
>
> I doubt that it would be measurably slower, but I agree that requiring
> both args to be Char could be done in fewer operations than just
> requiring one.
>
> However, I think the consistency argument is stronger.  We have a rule
> that operations on mixed types promote the more restrictive type to the
> less restrictive one, and I don't think we should handle this case
> differently.
>
> So I'd say we should allow all of Char + Num, Num + Char, and Char +
> Char, or, if this costs too much at evaluation time, we shouldn't allow
> any of them.

Currently, arithmetic on mixed-class data.frames
produces useful warnings and errors.  E.g.,

  > z <- data.frame(Factor=factor(c("Lo","Med","High")),
                  Char=letters[1:3],
                  Num1=exp(0:2),
                  Num2=(1:3)*pi,
                  stringsAsFactors=FALSE)
  > z+1
  Error in FUN(left, right) : non-numeric argument to binary operator
  In addition: Warning message:
  + not meaningful for factors in: Ops.factor(left, right)
  > z[,-2] + 1
    Factor     Num1      Num2
  1     NA 2.000000  4.141593
  2     NA 3.718282  7.283185
  3     NA 8.389056 10.424778
  Warning message:
  + not meaningful for factors in: Ops.factor(left, right)

If we made + do paste(sep="") for character+number, then
we would lose these warnings and errors and let garbage flow
further down the pipe.
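
To make the trade-off concrete, here is a minimal sketch of the two
candidate semantics.  The operators %+% and %++% are hypothetical names
invented for this illustration (they are not part of R, and base "+" is
left untouched); the strict one follows the "both arguments must be
character" suggestion quoted above, and the permissive one coerces
like paste().

```r
# Hypothetical operators for illustration only; neither exists in base R.

# Strict variant: concatenate only when both operands are character,
# otherwise signal an error so the mistake is caught early.
`%+%` <- function(e1, e2) {
  if (is.character(e1) && is.character(e2)) {
    paste(e1, e2, sep = "")
  } else {
    stop("non-character argument to %+%")
  }
}

# Permissive variant: let paste() coerce whichever operand is not character.
`%++%` <- function(e1, e2) paste(e1, e2, sep = "")

# Explicit conversion is required under the strict rule.
strict_ok <- "a" %+% as.character(1:3)

# Mixed types are rejected; capture the error message instead of stopping.
strict_err <- tryCatch("1" %+% 2, error = function(e) conditionMessage(e))

# Under the permissive rule the same call succeeds silently.
permissive <- "1" %++% 2
```

With the strict rule, "1" %+% 2 fails loudly; with the permissive rule it
quietly returns a string, which is the kind of garbage that would then
flow on unnoticed.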

Should factor data be treated as character data in this
case (e.g., pasting to the levels)?  That would be weird,
but many users confound character and factor data when
they are buried in data.frames.
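
As a small sketch of why the factor case is awkward (the variable names
here are made up for the example): paste() already works on a factor's
labels, while the underlying integer codes are something else entirely,
and the column class is easy to overlook inside a data.frame.

```r
# A factor stores integer codes plus a vector of level labels.
f <- factor(c("Lo", "Med", "High"))

# paste() converts the factor via its labels, not its codes.
pasted_labels <- paste(f, 1, sep = "")

# The integer codes follow the sorted levels ("High", "Lo", "Med").
codes <- as.integer(f)

# Inside a data.frame the distinction is only visible by inspecting classes.
d <- data.frame(Factor = f, Char = letters[1:3], stringsAsFactors = FALSE)
col_classes <- sapply(d, class)
```

Pasting to the labels is what a user who thinks of the column as
character data would expect, but it discards the factor structure and
returns a plain character vector.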

----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."


