[R] Question about PERL lookahead construct in regex's

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Mon Aug 10 18:36:49 CEST 2020


Folks:

Consider:
> y <- "xx wt"

> grep(" +(?=t)",y, perl = TRUE)
integer(0)
## Unexpected. Lookahead construct does not find "t" after space
## But
> grep(" +(?=.+t)",y, perl = TRUE)
[1] 1
## Expected. Given pattern for **exact** match, lookahead finds it
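
To see where each pattern matches, rather than only whether it does, `regexpr()` can report the start and length of the first match. The following is a minimal sketch on the same string; the names `m1` and `m2` are illustrative and not part of the session above.

```r
y <- "xx wt"

# Lookahead for "t" directly after the run of spaces: no match,
# because the character right after the space is "w".
m1 <- regexpr(" +(?=t)", y, perl = TRUE)

# Lookahead for "one or more characters, then t": matches at the space.
m2 <- regexpr(" +(?=.+t)", y, perl = TRUE)

as.integer(m1)             # start of the match; -1 means no match
as.integer(m2)             # start of the match
attr(m2, "match.length")   # length excludes whatever the lookahead inspected
```

The reported match length covers only the space(s), which fits the "use up no characters" wording in the help text.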

My concern is:
?regexp says this:
"Patterns (?=...) and (?!...) are zero-width positive and negative lookahead
 *assertions*: they match if an attempt to match the ... forward from the
current position would succeed (or not), but use up no characters in the
string being processed."

But this appears to be imprecise (it confused me, anyway). The usual sense
of "matching" in regexes is "match the pattern somewhere in the string
going forward." But in the Perl lookahead construct, the pattern apparently
must **exactly** match *everything* in the string that follows.
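
One way to probe this reading is to place a lookahead in front of text that continues well past what the lookahead describes. The sketch below is an experiment, not an authoritative statement of PCRE semantics, and the test strings are made up for illustration. If the lookahead had to account for everything that follows, the first call would return FALSE.

```r
# Hypothetical test string, not from the session above.
s <- "xx wt and more text"

# The lookahead names only the next character; more text follows it.
r1 <- grepl(" +(?=w)", s, perl = TRUE)

# The lookahead names a character that occurs later, but not immediately.
r2 <- grepl(" +(?=t)", "xx wt", perl = TRUE)

# The same pattern on a string where "t" directly follows the spaces.
r3 <- grepl(" +(?=t)", "xx t", perl = TRUE)

c(r1, r2, r3)
```

If `r1` and `r3` come back TRUE while `r2` is FALSE, that would suggest the lookahead is anchored at the current position but need not extend to the end of the string.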

Questions:
Am I correct about this? If not, what do I misunderstand?
If I am correct, should the regex help be slightly modified to something
like:

"Patterns (?=...) and (?!...) are zero-width positive and negative lookahead
 *assertions*: they match if an attempt to **exactly** match all of ... forward
from the current position would succeed (or not), but use up no characters
in the string being processed."

Thanks.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
