[R] Regex Split?

Martin Maechler
Fri May 5 18:30:33 CEST 2023


>>>>> Bill Dunlap on Fri, 5 May 2023 08:19:21 -0700 writes:

    https://bugs.r-project.org/show_bug.cgi?id=16745 (from 2016, still labelled
    "UNCONFIRMED") contains some other examples of strsplit misbehaving when
    using 0-length perl look-behinds.  E.g.,

Thank you, Bill -- yes, uhmm, ... a bit embarrassing.

I've finally changed the R bugzilla report's state to "CONFIRMED" now,
and also added the "HELPWANTED" keyword.
I think we (R Core) should be sorry to have completely forgotten
(or not cared about) the issue.

It's not hard to at least agree that the current behavior is buggy,
e.g., in the example you show here:

    >> strsplit(split="[[:<:]]", "One, two; three!", perl=TRUE)[[1]]
    > [1] "O"  "n"  "e"  ", " "t"  "w"  "o"  "; " "t"  "h"  "r"  "e"  "e"  "!"
    >> gsub(pattern="[[:<:]]", "#", "One, two; three!", perl=TRUE)
    > [1] "#One, #two; #three!"
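
Until strsplit() is fixed, one possible workaround is to locate the
zero-length matches with gregexpr() and cut the string there with
substring().  The following is only a minimal sketch under that
assumption, not a proposed patch: the helper name split_at() is
hypothetical, it handles a single string, and it only makes sense for
patterns whose matches have length zero.

```r
## Hypothetical helper (not part of base R): split one string at the
## positions of zero-length PCRE matches, as a workaround sketch.
split_at <- function(x, pattern) {
  stopifnot(is.character(x), length(x) == 1L)
  if (!nzchar(x)) return(x)
  ## start positions of all matches (match length is 0 for such patterns)
  m <- gregexpr(pattern, x, perl = TRUE)[[1L]]
  if (m[1L] == -1L) return(x)            # no match: nothing to split
  ## piece boundaries: always start at 1, ignore a match past the end
  starts <- unique(c(1L, as.integer(m)))
  starts <- starts[starts <= nchar(x)]
  ends   <- c(starts[-1L] - 1L, nchar(x))
  substring(x, starts, ends)
}

split_at("One, two; three!", "[[:<:]]")
```

With the example string above, this should give the three pieces
"One, ", "two; " and "three!", i.e., splits at the same places where
gsub() inserts the "#".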

[...]
[...]

Maybe this should be continued either on Bugzilla (i.e., the URL above),
or, if needed, additionally on R-devel.

Yes, I also added that we'd be grateful for (tested) patches and/or
code reviewers.

Martin


--
Martin Maechler
ETH Zurich  and  R Core team


