[Rd] regexec() bug in R 3.4.0

Martin Maechler maechler at stat.math.ethz.ch
Thu Jun 29 12:32:54 CEST 2017

>>>>> Weeks, Nathan <Nathan.Weeks at ARS.USDA.GOV>
>>>>>     on Wed, 28 Jun 2017 17:11:01 +0000 writes:

> Hi,
> In R 3.4.0, the "Pattern Matching and Replacement" documentation that describes regexec(), gregexpr(), etc. states that the "text" argument to regexec is a character vector, "or an object which can be coerced by as.character to a character vector":
>      regexec(pattern, text, ignore.case = FALSE, perl = FALSE,
>              fixed = FALSE, useBytes = FALSE)
>      x, text: a character vector where matches are sought, or an object
>          which can be coerced by as.character to a character vector.
>          Long vectors are supported.
> However, in R 3.4.0, this coercion doesn't seem to automatically occur for the text argument of regexec(), whereas it does for gregexpr(), regexpr(), etc:
> ============================================================
> $ R --vanilla
> R version 3.4.0 (2017-04-21) -- "You Stupid Darkness"
> Copyright (C) 2017 The R Foundation for Statistical Computing
> Platform: x86_64-pc-linux-gnu (64-bit)
> ...
> > text <- as.factor("foobar")
> > regexec("foo", text)
> Error in regexec("foo", text) : invalid 'text' argument


I agree this is an inconsistency between documentation and
behaviour, and hence a bug (though one that is easy to work around).
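
For the record, a minimal sketch of the workaround I have in mind,
assuming you are free to coerce 'text' yourself before the call
(the variable names below are only illustrative):

```r
# A factor, as in the report above
text <- as.factor("foobar")

# Workaround: coerce explicitly, so regexec() sees a character vector
chr <- as.character(text)
m <- regexec("foo", chr)

# Extract the matched substring(s) from the coerced vector
hits <- regmatches(chr, m)
print(hits)
```

The explicit as.character() call is harmless once regexec() does the
coercion itself, so code written this way keeps working afterwards.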

I propose to fix the code (for consistency) rather than the
documentation and will do so if there's no dissent.
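
To illustrate the kind of change meant here, the following is only a
sketch and not necessarily the actual patch: a hypothetical wrapper
(the name regexec_coerced is made up for this example) that coerces
'text' before calling regexec(), as the help page already promises.

```r
# Hypothetical wrapper, NOT part of base R: coerce 'text' as documented,
# then delegate to regexec() with the same arguments.
regexec_coerced <- function(pattern, text, ignore.case = FALSE,
                            perl = FALSE, fixed = FALSE, useBytes = FALSE)
{
    if (!is.character(text)) text <- as.character(text)
    regexec(pattern, text, ignore.case = ignore.case, perl = perl,
            fixed = fixed, useBytes = useBytes)
}

# The failing call from the report now goes through
res <- regexec_coerced("foo", as.factor("foobar"))
print(res)
```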

We have become wary of last-minute changes, so
this won't be in R 3.4.1 (due tomorrow, Friday) but probably
in "R 3.4.1 patched" later, and then in future versions.

Martin Maechler,
ETH Zurich
