[Rd] [WISH / PATCH] possibility to split string literals across multiple lines

Mark van der Loo mark.vanderloo at gmail.com
Wed Jun 14 14:23:01 CEST 2017


I know it doesn't cause construction at parse time, and that is not what
I claimed. What I meant is that it at least makes the syntax look a
little as if there were a line-breaking character within string literals.
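
To make the parse-time versus run-time distinction concrete, here is a minimal sketch. It assumes the `%+%` operator defined further down in this thread; the names lit and expr are only illustrative. It shows that a plain literal is already a string once parsed, whereas the `%+%` chain is still a call that has to be evaluated.

```r
# Operator as defined in the quoted message below; paste0() does the work.
`%+%` <- function(x, y) paste0(x, y)

# quote() returns what the parser produced, without evaluating it.
lit  <- quote("hello pretty world")
expr <- quote("hello" %+% " pretty" %+% " world")

print(is.character(lit))          # TRUE: the parser already built the string
print(is.call(expr))              # TRUE: still an unevaluated call to `%+%`
print(identical(eval(expr), lit)) # TRUE: same value, but built at run time
```

So the operator only changes how the code looks; the concatenation still happens each time the expression is evaluated.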

On Wed 14 Jun 2017 at 14:18, Joris Meys <jorismeys at gmail.com> wrote:

> Mark, that's actually a fair statement, although your extra operator
> doesn't cause construction at parse time. You still call paste0(), but just
> add an extra layer on top of it.
>
> I also doubt that even in gigantic loops the benefit is going to be
> significant. Take the following example:
>
> atestfun <- function(x){
>   y <- paste0("a very long",
>          "string for testing")
>   grep(x, y)
> }
> atestfun2 <- function(x){
>   y <- "a very long
> string for testing"
>   grep(x,y)
> }
> library(compiler)  # provides cmpfun()
> cfun <- cmpfun(atestfun)
> cfun2 <- cmpfun(atestfun2)
>
> require(rbenchmark)
> benchmark(atestfun("a"),
>           atestfun2("a"),
>           cfun("a"),
>           cfun2("a"),
>           replications = 100000)
>
> Which gives after 100,000 replications:
>
>             test replications elapsed relative
> 1  atestfun("a")       100000    0.83    1.339
> 2 atestfun2("a")       100000    0.62    1.000
> 3      cfun("a")       100000    0.81    1.306
> 4     cfun2("a")       100000    0.62    1.000
>
> The patch can in principle make similar code marginally faster, but I'm
> not convinced it is going to make any real difference except in some very
> specific and exotic cases. Moreover, calling a function like the examples
> inside the loop is the only situation I can come up with where this might
> be a problem. If you just construct the string inside the loop, there are
> two possibilities:
>
> - the string does not need to change, and then you had better construct it
> outside of the loop
> - the string does need to change, and then you need paste() or paste0()
> anyway
>
> I'm not against incorporating the patch, as it would eliminate a few
> keystrokes. It's a neat idea, but I don't expect any other noticeable
> advantage from it.
>
> my humble 2 cents
> Cheers
> Joris
>
> On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo <
> mark.vanderloo at gmail.com> wrote:
>
>> Having some line-breaking character for string literals would have
>> benefits, as string literals can then be constructed at parse time
>> rather than at run time. I have run into this myself a few times as
>> well. One way to at least emulate something like that is the following.
>>
>> `%+%` <- function(x,y) paste0(x,y)
>>
>> "hello" %+%
>>   " pretty" %+%
>>   " world"
>>
>>
>> -Mark
>>
>>
>>
>> On Wed 14 Jun 2017 at 13:53, Andreas Kersting <
>> r-devel at akersting.de> wrote:
>>
>> > On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <
>> > murdoch.duncan at gmail.com> wrote:
>> >
>> > > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>> > > > Hi,
>> > > >
>> > > > I would really like to have a way to split long string literals
>> > > > across multiple lines in R.
>> > >
>> > > I don't understand why you require the string to be a literal.  Why
>> > > not construct the long string in an expression like
>> > >
>> > >   paste0("aaa",
>> > >          "bbb")
>> > >
>> > > ?  Surely the execution time of the paste0 call is negligible.
>> > >
>> > > Duncan Murdoch
>> >
>> > Actually "execution time" is precisely one of the reasons why I would
>> > like to see this feature, as - depending on the context (e.g. in a
>> > tight loop) - the execution time of paste0 (or probably also glue,
>> > thanks Gabor) is not necessarily insignificant.
>> >
>> > The other reason is style: I think it is cleaner if we can construct
>> > such a long string literal without the need for a function call.
>> >
>> > Andreas
>> >
>> > > >
>> > > > Currently, if a string literal spans multiple lines, there is no
>> > > > way to inhibit the introduction of newline characters:
>> > > >
>> > > >  > "aaa
>> > > > + bbb"
>> > > > [1] "aaa\nbbb"
>> > > >
>> > > >
>> > > > If a line ends with a backslash, the backslash is just ignored:
>> > > >
>> > > >  > "aaa\
>> > > > + bbb"
>> > > > [1] "aaa\nbbb"
>> > > >
>> > > >
>> > > > We could use this fact to implement string splitting in a fairly
>> > > > backward-compatible way, since currently such trailing backslashes
>> > > > should hardly be used as they do not have any effect. The attached
>> > > > patch makes the parser ignore a newline character directly
>> > > > following a backslash:
>> > > >
>> > > >  > "aaa\
>> > > > + bbb"
>> > > > [1] "aaabbb"
>> > > >
>> > > >
>> > > > I personally would also prefer if leading blanks (spaces and tabs)
>> > > > in the second line are ignored to allow for proper indentation:
>> > > >
>> > > >  >   "aaa \
>> > > > +    bbb"
>> > > > [1] "aaa bbb"
>> > > >
>> > > >  >   "aaa\
>> > > > +    \ bbb"
>> > > > [1] "aaa bbb"
>> > > >
>> > > > This is also implemented by this patch.
>> > > >
>> > > >
>> > > > An alternative approach could be to have something like
>> > > >
>> > > > ("aaa "
>> > > > "bbb")
>> > > >
>> > > > or
>> > > >
>> > > > ("aaa ",
>> > > > "bbb")
>> > > >
>> > > > be interpreted as "aaa bbb".
>> > > >
>> > > > I don't know the ins and outs of the parser of R (hence: please very
>> > > > carefully review the attached patch), but I guess this would be more
>> > > > work to implement!?
>> > > >
>> > > >
>> > > > What do you think? Is there anybody else who is missing this
>> > > > feature in the first place?
>> > > >
>> > > > Regards,
>> > > > Andreas
>> > > >
>> > > >
>> > > >
>> > > > ______________________________________________
>> > > > R-devel at r-project.org mailing list
>> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
>> > > >
>> >
>> >
>>
>
>
>>
>>
>
>
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel :  +32 (0)9 264 61 79
> Joris.Meys at Ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
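
For comparison, the continuation behaviour proposed in the original post (drop the newline, plus any leading blanks on the next line) can be roughly emulated at run time. This is only a sketch: join_lines is a hypothetical helper, not part of base R or of the attached patch, and since it is an ordinary function call it addresses the style concern but not the parse-time one.

```r
# Hypothetical helper: remove each newline together with the blanks
# (spaces and tabs) that immediately follow it.
join_lines <- function(x) gsub("\n[ \t]*", "", x)

# The multi-line literal is written with an explicit "\n" here so that the
# example does not depend on how the parser treats a trailing backslash.
s <- join_lines("aaa \n     bbb")
print(s)  # "aaa bbb"
```

Unlike the patch, this removes every newline in the string, so it is only suitable when the literal contains no intended line breaks.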



