[R] using regular expressions to retrieve a digit-digit-dot structure from a string

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Tue Jun 9 01:18:32 CEST 2009


Gabor Grothendieck wrote:
> Try this.  See ?regex for more.
>
>   
>> x <- 'This happened in the 21. century." (the dot behind 21 is'
>> regexpr("(?![0-9]+)[.]", x, perl = TRUE)
>>     
> [1] 24
> attr(,"match.length")
> [1] 1
>   

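yes, but the lookahead (?![0-9]+) is checked at the dot itself, where the
next character is never a digit, so the pattern matches every dot: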

    gregexpr('(?![0-9]+)[.]', 'a. 1. a1.', perl=TRUE)
    # 2 5 9

which, i guess, is not what you want.  if what you want is to match all
and only those dots that immediately follow a run of one or more digits
starting at a word boundary, then the following should do, as far as i
can see (\K drops the digits from the reported match):

    gregexpr('\\b[0-9]+\\K[.]', 'a. 1. a1.', perl=TRUE)
    # 5
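
for comparison, here is a minimal sketch that puts the patterns side by
side on the same test string.  the lookbehind variant and the variable
names (pos_all, pos_k, pos_lb) are not from the thread; they are added
only to illustrate what the word boundary buys you:

```r
x <- 'a. 1. a1.'

# original pattern: the lookahead never fails at a dot, so every dot matches
pos_all <- as.integer(gregexpr('(?![0-9]+)[.]', x, perl=TRUE)[[1]])
# 2 5 9

# \b + \K: digits must start at a word boundary; only the dot is reported
pos_k <- as.integer(gregexpr('\\b[0-9]+\\K[.]', x, perl=TRUE)[[1]])
# 5

# lookbehind (illustrative alternative): any dot directly after a digit,
# with no word-boundary requirement, so the dot in 'a1.' matches as well
pos_lb <- as.integer(gregexpr('(?<=[0-9])[.]', x, perl=TRUE)[[1]])
# 5 9
```

as.integer() just strips the match.length attribute so that only the
start positions remain.  if dots after things like 'a1' should count
too, the lookbehind form may be the simpler choice.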

vQ



