[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Tue Jun 9 20:12:20 CEST 2009


Tan, Richard wrote:
> Sorry I did not give some examples in my previous posting to make my
> question clear.  It's not exactly 1 digit, but at least one digit.  Here
> are some examples:
>
>
>> input = c(none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
>>     all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0')
>> gsub(x=input, replacement='x', perl=TRUE, pattern=something)
>
>                                                                     none
> "0foo f0oo foo0 foo00 f0o0o foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER"
>           all
> "x x x x x x"

ok, then to my simple mind the following should do:
   
    input = c(
        none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
        all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0 123456789')

    gsub('(?=[[:alpha:]]{0,8}[[:digit:]])\\b[[:alnum:]]{5,9}\\b', 'x',
         input, perl=TRUE)
    # none -> '0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER'
    # all  -> 'x x x x x x x'

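where the regex reads 'if there is a digit ahead of you, preceded by at
most 8 letters, match 5 to 9 alphanumerics (digits and/or letters)
delimited by word boundaries'.  the lookahead guarantees at least one
digit within the first 9 characters; the \\b ... \\b part enforces the
length of the whole token.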
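
as a sanity check, here is a minimal sketch that lists the tokens the
regex actually matches and compares them with a plainer two-step test
(length 5 to 9, and at least one digit) applied to each
space-separated token.  the helper two_step is made up for this
illustration and is not part of the solution above; it also assumes
tokens are separated by single spaces, as in the example input.

```r
input = c(
    none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
    all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0 123456789')
pattern = '(?=[[:alpha:]]{0,8}[[:digit:]])\\b[[:alnum:]]{5,9}\\b'

# tokens matched by the lookahead regex, one character vector per element
matched = regmatches(input, gregexpr(pattern, input, perl=TRUE))

# hypothetical helper: the same rule written as two separate tests
# on whitespace-separated tokens
two_step = function(s) {
    tokens = strsplit(s, ' ', fixed=TRUE)[[1]]
    tokens[grepl('^[[:alnum:]]{5,9}$', tokens) & grepl('[[:digit:]]', tokens)]
}
expected = lapply(unname(input), two_step)

# TRUE when both approaches select exactly the same tokens
identical(unname(matched), expected)
```

the two-step version is easier to read, but it only tests tokens; the
single regex is what you need if the replacement has to happen in place
with gsub.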

vQ



