[Rd] "+" for character method...

Duncan Murdoch murdoch at stats.uwo.ca
Fri Aug 25 19:18:42 CEST 2006


On 8/25/2006 12:31 PM, Martin Maechler wrote:
> This thread reminds me of an old recurring (last May!) theme 
> which maybe fits well to Friday late afternoon...
> 
> There have been propositions to make  "+"  work  in S (and R)
> like in some other languages,
> namely for character (vectors),
>  
>      a + b :=  paste(a,b, sep="")
> 
> IIRC, when this theme came up last, the one argument against it
> was the penalty of method dispatch that we were not willing to
> pay for something as fundamentally speed-important as "+" --
> which is a .Primitive in R exactly for that reason of efficiency.
> 
> But then, we actually do dispatch for "+" -- internally in C
> code via DispatchGroup()  --- but only if we need, so not when usual
> numeric/complex arguments are used.
> 
> I think - but may be wrong - it should be possible to also check
> very fast for two "character" arguments and in that case do a fast
> version of  paste(a, b,  sep="").

But for consistency shouldn't this work if only one of the args is 
character, coercing the other to character?  E.g. we have

 > "2" > 10
[1] TRUE
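
As a rough sketch of what that coercing behaviour could look like,
here is a user-level stand-in.  The operator name %+% is hypothetical
(it is not part of R, and this is not the C-level change discussed
above); it only illustrates coercing the non-character side before
pasting, in the same spirit as the comparison just shown.

```r
# Hypothetical infix operator, not part of R: coerce both operands to
# character, then concatenate them with an empty separator.
`%+%` <- function(a, b) paste(as.character(a), as.character(b), sep = "")

cmp   <- "2" > 10      # 10 is coerced to "10", so strings are compared
both  <- "a" %+% "b"   # both operands already character
left  <- "a" %+% 1:3   # numeric side coerced; "a" is recycled
right <- 2 %+% "x"     # coercion also applies to the left operand
```

Here left is the vector "a1" "a2" "a3" and right is "2x", so the
result does not depend on which side is the character one.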

> When this last came up (in May), Brian said this about the fact
> that you cannot simply define "+.character":
> 
>>> I would think that the intention was also to positively discourage messing 
>>> with the basics of R, as if you were able to do this erroneous uses would 
>>> likely not get caught.
>    ( https://stat.ethz.ch/pipermail/r-help/2006-May/104751.html )
> and subsequently (https://stat.ethz.ch/pipermail/r-help/2006-May/104754.html)
> gave an example for this
> 
>>> 2 + x, for example, where x is not numeric.
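
To make that example concrete, the sketch below shows the current
behaviour: a non-numeric operand to "+" is an error, and defining
"+.character" at user level does not change that, because the
internal dispatch only happens for objects that carry a class
attribute.  The variable names are only for illustration.

```r
# Today a stray non-numeric operand is caught immediately.
x <- "a"
msg <- tryCatch(2 + x, error = function(e) conditionMessage(e))
# In an English locale, msg reads:
#   "non-numeric argument to binary operator"

# A user-level "+.character" is not consulted for plain strings:
# is.object("a") is FALSE, so the primitive never dispatches to it.
"+.character" <- function(e1, e2) paste(e1, e2, sep = "")
still_error <- inherits(try("a" + "b", silent = TRUE), "try-error")
```

If "+" pasted whenever one operand was character, 2 + x would
return "2a" here instead of signalling the mistake.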

This is a valid concern, but I think the clarity obtained by coding 
paste operations using + is worth it.

For example, the first instance of paste(a, b,  sep="") I see in the 
source is

is.ALL(structure(1:7, names = paste("a",1:7,sep="")))

in base/demo/is.things.R

which I find clearer as

is.ALL(structure(1:7, names = "a" + 1:7))

But then I'm used to using + for strings from Borland's Pascal 
extensions; to a C-speaker the meaning may not be so obvious.

Duncan Murdoch


> I wonder however, if we do this in C, and basically only go into
> the paste-branch when both arguments are characters,
> if we wouldn't get to a nice useful solution without a noticeable
> performance penalty.
> 
> This would also solve my other slightly related uneasiness:
> Many times in the past, when using paste(..., sep='')
> in function definitions I had wanted this (empty sep) to be the
> default and to have an easier, more readable way to achieve the
> same.
> 
> But then these all are just musings at the end of the week...
> 
> Martin Maechler, ETH Zurich
> 


