[Rd] Improving string concatenation

Martin Maechler maechler at stat.math.ethz.ch
Fri Jun 19 12:31:28 CEST 2015


>>>>> Radford Neal <radford at cs.toronto.edu>
>>>>>     on Thu, 18 Jun 2015 14:32:18 -0400 writes:

    > Gabor Csardi writes:
    >> Btw. for some motivation, here is a (surely incomplete)
    >> list of languages with '+' as the string concatenation
    >> operator:
    >> 
    >> ALGOL 68, BASIC, C++, C#, Cobra, Pascal, Object Pascal,
    >> Eiffel, Go, JavaScript, Java, Python, Turing, Ruby,
    >> Windows PowerShell, Objective-C, F#, Scala, Ya.


    > The situation for R is rather different from that of a
    > language (as many of the above) in which variables are
    > declared to be of a specific type.

    > In such a statically typed language, when you see the
    > expression "a+b", it is easy to figure out whether the "+"
    > will be numeric addition or string concatenation, by
    > looking at the declarations of "a" and "b".

    > But in a language such as R in which values have types,
    > but variables don't, someone seeing "a+b" in code wouldn't
    > be able to tell easily what it does.  This is OK, in fact
    > desirable, in the case of operator dispatch according to
    > class when the different methods implement versions of the
    > operator that have analogous properties.  But numeric
    > addition and string concatenation have just about nothing
    > in common, so cases where functions are meant to have "+"
    > be either addition OR concatenation are going to be rare.
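
A minimal sketch of the point above, assuming a hypothetical helper add()
(the name is made up for illustration). Nothing in its source says what "+"
will do; that depends on the values passed at run time. The Date and complex
cases show the "analogous" kind of dispatch, where each method is still a
form of addition:

```r
# Hypothetical helper: the source alone does not reveal which "+" is used.
add <- function(a, b) a + b

num_sum  <- add(1L, 2.5)                    # integer + double
date_sum <- add(as.Date("2015-06-19"), 7)   # "+" method for Date: 7 days later
cplx_sum <- add(1+2i, 3)                    # complex addition

print(num_sum)
print(date_sum)
print(cplx_sum)
```

In all three calls "+" still means "add"; concatenation would be the first
case where it means something unrelated.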

    > Furthermore, making "+" concatenate strings would preclude
    > ever making "+" convert strings to numbers (signalling an
    > error if they aren't in some numerical format) and then
    > add them.  I'm not sure whether that would be a good idea
    > or not, but it might be unwise to eliminate the
    > possibility.

    > And of course, as someone else mentioned, it may instead
    > be desirable for attempts to add strings to signal an
    > error, as at present, which one also gives up by making
    > "+" do concatenation.
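
For reference, a small sketch of the present behaviour that both paragraphs
above refer to: "+" applied to character vectors signals an error, whether
or not the strings look like numbers. The variable names are only for
illustration, and the error message text is deliberately not checked, since
it depends on the locale:

```r
# Current behaviour: "+" on character vectors signals an error.
res_words  <- tryCatch("abc" + "def", error = function(e) e)
res_digits <- tryCatch("1" + "2",     error = function(e) e)

is_err <- c(inherits(res_words, "error"), inherits(res_digits, "error"))
print(is_err)

# Today the intent has to be spelled out explicitly:
explicit_sum <- as.numeric("1") + as.numeric("2")   # numeric addition
explicit_cat <- paste("1", "2", sep = "")           # concatenation
print(explicit_sum)
print(explicit_cat)
```

Any code that relies on this error (for instance to catch a character
column that should have been numeric) would change behaviour silently if
"+" started to concatenate.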


    >> Yes, even Fortran has one, and in C, I can simply write
    >> "literal1" "literal2" and they'll be concatenated. It is
    >> only for literals, but still very useful.

    > Concatenation of literal strings could easily be added to
    > the R parser without changing anything else.  (Getting
    > them to deparse as the same two pieces would be tricky,
    > but is maybe not necessary.)

    >    Radford Neal

Thank you, Bill Dunlap, Radford, Hervé,

and others who have explained, indirectly, that the subject of this thread
is at best rather incomplete and at worst just not true:

Such an "improvement" -- making something more convenient in many
cases -- would lead to (backward) incompatibilities -- breaking
current functionality -- and inconsistencies in R.

As this thread hopefully comes to a conclusion for now,
let me mention that this is not the first time the topic has
come up ... and those of us who may still be around in 5
years, please try to remember:  It will come up every few
years.

Nine years ago was one such occasion --- in this same place,
R-devel ---

I started it myself, here (as a Friday afternoon "event", diverting
from the more relevant topic of S4 methods for "+"),
  https://stat.ethz.ch/pipermail/r-devel/2006-August/038991.html
see also here
  https://stat.ethz.ch/pipermail/r-devel/2006-August/threads.html

or the Gmane archive version of it, e.g. with John Chambers (citing Bill Dunlap)
   http://thread.gmane.org/gmane.comp.lang.r.devel/9331/focus=9347

Also, the arguments (against "+" for string concatenation)
of Thomas Lumley were *not* repeated this time (I think)
   https://stat.ethz.ch/pipermail/r-devel/2006-August/039012.html

---------
Historical note: At the time, there was only 'paste()' in base R.
Gabor did mention  paste0()  as a possible compromise,  
and indeed, we did add paste0() to R eventually.
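
A short sketch of what that compromise looks like in practice. paste() and
paste0() are the base R functions named above; the `%+%` operator at the end
is a hypothetical user-level definition, not part of base R, shown only as
one way to get infix concatenation without touching "+":

```r
with_sep    <- paste("R", "devel")             # default sep = " "
no_sep_old  <- paste("R", "devel", sep = "")   # the spelling before paste0()
no_sep_new  <- paste0("R", "devel")            # same result, shorter

# Both are vectorized, and 'collapse' joins the pieces into one string.
labels <- paste0("x", 1:3)
joined <- paste0(c("a", "b", "c"), collapse = "+")

# Hypothetical infix operator (an assumption for illustration only).
`%+%` <- function(a, b) paste0(a, b)
infix <- "R" %+% "devel"

print(c(with_sep, no_sep_old, no_sep_new, infix))
print(labels)
print(joined)
```

A user-defined operator like this keeps "+" free for arithmetic and leaves
the existing error for character arguments in place.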

Maybe we should make this into a new  R-FAQ entry -- so next
time, we can point there instead of re-hashing things every so often.

I'm volunteering to collect "patches" -- ideally texinfo format,
the latest source of the R FAQ list being
 https://svn.r-project.org/R/trunk/doc/manual/R-FAQ.texi 


Martin Maechler


