[R] Question about PERL lookahead construct in regex's

Greg Snow 538280 using gmail.com
Tue Aug 11 21:23:21 CEST 2020


I think that the current documentation is correct, but that does not
mean that it cannot be improved.

The key phrase for me is "from the current position", which says to me
that the match needs to start right there, not just somewhere in the
rest of the string.

If you used the expression " +t", you would expect it to match only
if the t came immediately after the last space, not somewhere later in
the string. It is the same with the look-ahead: in "xx wt" the
character right after the space is "w", so (?=t) fails at that
position. The look-ahead does not have to match everything that
follows; it only has to match starting at that point.
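
Here is a minimal sketch of that point in R, reusing the string from the
original question. The variable names and the extra " +(?=w)" pattern are
my own additions for illustration and were not in the original post.

```r
# Sketch: a lookahead is tested at the current position only.
y <- "xx wt"

# After " +" consumes the space, the next character is "w", so (?=t) fails.
look_t <- grepl(" +(?=t)", y, perl = TRUE)

# " +t" fails for the same reason: "t" does not immediately follow the space.
plain_t <- grepl(" +t", y, perl = TRUE)

# Letting the lookahead skip characters before "t" makes it succeed.
look_any_t <- grepl(" +(?=.+t)", y, perl = TRUE)

# The lookahead only needs to match at the position, not the whole remainder:
# "w" follows the space, and the trailing "t" is simply ignored.
look_w <- grepl(" +(?=w)", y, perl = TRUE)

# Zero-width: the matched text is just the space, not the looked-ahead part.
matched <- regmatches(y, regexpr(" +(?=.+t)", y, perl = TRUE))

print(c(look_t = look_t, plain_t = plain_t,
        look_any_t = look_any_t, look_w = look_w))
print(nchar(matched))
```

The " +(?=w)" case is the one that bears on the proposed rewording: it
succeeds even though the look-ahead pattern does not cover the rest of
the string.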

On Mon, Aug 10, 2020 at 10:37 AM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> Folks:
>
> Consider:
> > y <- "xx wt"
>
> > grep(" +(?=t)",y, perl = TRUE)
> integer(0)
> ## Unexpected. Lookahead construct does not find "t" after space
> ## But
> > grep(" +(?=.+t)",y, perl = TRUE)
> [1] 1
> ## Expected. Given pattern for **exact** match, lookahead finds it
>
> My concern is:
> ?regexp says this:
> "Patterns (?=...) and (?!...) are zero-width positive and negative lookahead
>  *assertions*: they match if an attempt to match the ... forward from the
> current position would succeed (or not), but use up no characters in the
> string being processed."
>
> But this appears to be imprecise (it confused me, anyway). The usual sense
> of "matching" in regex's is "match the pattern somewhere in the string
> going forward." But in the perl lookahead construct it apparently must
> **exactly** match *everything* in the string that follows.
>
> Questions:
> Am I correct about this? If not, what do I misunderstand?
> If I am correct, should the regex help be slightly modified to something
> like:
>
> "Patterns (?=...) and (?!...) are zero-width positive and negative lookahead
>  *assertions*: they match if an attempt to **exactly** match all of ... forward
> from the current position would succeed (or not), but use up no characters
> in the string being processed."
>
> Thanks.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )



-- 
Gregory (Greg) L. Snow Ph.D.
538280 using gmail.com


