[R] Regular expressions: offsets of groups

Gabor Grothendieck ggrothendieck at gmail.com
Mon Sep 27 20:10:18 CEST 2010


On Mon, Sep 27, 2010 at 1:34 PM, Titus von der Malsburg
<malsburg at gmail.com> wrote:
> On Mon, Sep 27, 2010 at 7:29 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> Try this zero width negative look behind expression:
>>
>>> gregexpr("(?!a+)(b+)", "abcdaabbc", perl = TRUE)
>> [[1]]
>> [1] 2 7
>> attr(,"match.length")
>> [1] 1 2
>
> Thanks Gabor, but this gives me the same result as
>
>  gregexpr("b+", "abcdaabbc", perl = TRUE)
>
> which is wrong if the string is "abcdaabbcbbb".
>

Sorry, try this zero width positive look behind expression instead:

>  gregexpr("(?<=a)b+", "abcdaabbcbbb", perl = TRUE)
[[1]]
[1] 2 7
attr(,"match.length")
[1] 1 2

Note that it does not give the same answer as the plain pattern, which
also matches the final run of b's at offset 10:

>  gregexpr("b+", "abcdaabbcbbb", perl = TRUE)
[[1]]
[1]  2  7 10
attr(,"match.length")
[1] 1 2 3
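
The look behind (?<=a) requires an "a" immediately before the match but
does not consume it, so the reported offset is that of the first "b" and
not of the "a". Here is a minimal sketch of turning those offsets into
the matched text. The variable names (x, m, starts, lens, hits, m2) are
only illustrative. The second half assumes a more recent version of R in
which gregexpr(..., perl = TRUE) also returns "capture.start" and
"capture.length" attributes for parenthesised groups; if your version
lacks them, only the first half applies.

```r
x <- "abcdaabbcbbb"

# Look behind: match runs of "b" that are directly preceded by an "a".
m <- gregexpr("(?<=a)b+", x, perl = TRUE)
starts <- as.integer(m[[1]])                        # offsets of the matches
lens   <- as.integer(attr(m[[1]], "match.length"))  # lengths of the matches

# Recover the matched text from offsets and lengths.
hits <- substring(x, starts, starts + lens - 1)
print(data.frame(start = starts, length = lens, text = hits))

# Alternative (assumes an R version that reports capture offsets):
# match "a" followed by a captured run of "b" and read the group offsets.
m2 <- gregexpr("a(b+)", x, perl = TRUE)
grp_start <- as.integer(attr(m2[[1]], "capture.start"))
grp_len   <- as.integer(attr(m2[[1]], "capture.length"))
print(data.frame(start = grp_start, length = grp_len))
```

Both approaches should report the same offsets for this string. The look
behind keeps the whole match equal to the group, while the capture
version also works when the prefix is not fixed width, which look behind
does not allow.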

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com