[Rd] Improving string concatenation

Gábor Csárdi csardi.gabor at gmail.com
Wed Jun 17 02:13:24 CEST 2015


On Tue, Jun 16, 2015 at 6:32 PM, Joshua Bradley <jgbradley1 at gmail.com> wrote:
[...]
> An old (2005) post
> <https://stat.ethz.ch/pipermail/r-help/2005-February/066709.html> on r-help
> mentioned possible performance reasons as to why this type of string
> concatenation is not supported out of the box but did not go into detail.
> Can someone explain why such a basic task as this must be handled by
> paste() instead of just using the '+' operator directly?

Well, R-core's reason was in that email thread, quoting:

"The issue is that only coercion between numeric
(broad sense, including complex) types is supported for the arithmetical
operators, presumably to avoid the ambiguity of things like

x <- 123.45
y <- as.character(1)
x + y

Should that be 124.45 or "123.451"?  One of the difficulties of any
dispatch on two arguments is how to do the best matching on two classes,
especially with symmetric operators like "+".  Internally R favours simple
fast rules."
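
To make the ambiguity in the quoted example concrete, here is a minimal sketch that spells out both readings with explicit coercion. The variable names as_number and as_string are mine, not from the original thread.

```r
x <- 123.45
y <- as.character(1)

# Reading 1: coerce the string to a number, then add.
as_number <- x + as.numeric(y)

# Reading 2: coerce the number to a string, then concatenate.
as_string <- paste0(x, y)

print(as_number)
print(as_string)
```

These correspond to the two candidate results named in the quote, 124.45 and "123.451".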

Personally, I am not really convinced by this, because what currently
happens is this:

1 + "1"
#> Error in 1 + "1" : non-numeric argument to binary operator
"1" + 1
#> Error in "1" + 1 : non-numeric argument to binary operator

which is perfectly fine behavior, and it could stay the same with a
'+' string concatenation operator, i.e.:
- if both arguments are character vectors, call paste() with sep = "",
- otherwise go on and do whatever is being done right now.
In other words, coercion to string is not important in the '+' operator.
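
The rule above can be sketched as an ordinary R function. This is only an illustration under my own assumptions: plus() is a hypothetical stand-in for '+', not an existing function, and I use paste0() so that the strings are joined without a separator. A real change would presumably live in the C code of the arithmetic operators, not in an R-level wrapper.

```r
# Hypothetical stand-in for a '+' that concatenates strings.
plus <- function(e1, e2) {
  if (is.character(e1) && is.character(e2)) {
    # Both arguments are character vectors: concatenate, no coercion needed.
    paste0(e1, e2)
  } else {
    # Anything else: fall through to the existing behavior of '+',
    # including the error for mixed numeric/character arguments.
    e1 + e2
  }
}

plus("foo", "bar")   # both character: concatenation
plus(1, 2)           # both numeric: ordinary addition

# Mixed arguments still fail, exactly as they do today.
mixed <- tryCatch(plus(1, "1"), error = function(e) "error")
```

Because the character check only succeeds when both arguments are already strings, no number is ever coerced to a string, so the 124.45 versus "123.451" question never arises.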

> Would performance
> degrade much today if the '+' form of string concatenation were added into
> R by default?

Personally, I highly doubt it, but I don't have a benchmark to back this up.

Gabor

[...]


