[R] Regex question to find a string that contains 5-9 alpha-numeric characters, at least one of which is a number

Tan, Richard RTan at panagora.com
Tue Jun 9 19:20:03 CEST 2009


Sorry, I did not give any examples in my previous posting to make my
question clear.  The requirement is not exactly one digit, but at least
one digit.  Here are some examples:

> input = c(none='0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
+           all='foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0')
> gsub(x=input, replacement='x', perl=TRUE, pattern=something)

                                                        none
"0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER"
                                                         all
                                               "x x x x x x"
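
For reference, here is a minimal sketch of one pattern that would give the output above. It assumes the candidate strings are separated by spaces or other non-word characters; the name `pattern` and the use of \b word boundaries are my own assumptions, not something settled in this thread.

```r
input <- c(none = '0foo f0oo foo0 foofoofoo0 0foofoofoo TOOLOOOO9NGG NONUMBER',
           all  = 'foob0 fo0o0b 0foob 0foobardo foob4rdoo foobardo0')

# \b ... \b           : the whole token must consist of 5-9 alphanumerics
# (?=[a-zA-Z]*[0-9])  : lookahead requiring at least one digit in the token
# (hypothetical pattern, offered as a sketch only)
pattern <- '\\b(?=[a-zA-Z]*[0-9])[a-zA-Z0-9]{5,9}\\b'

result <- gsub(pattern, 'x', input, perl = TRUE)
print(result)
```

Note that \b treats the underscore as a word character, so a token glued to an underscore (e.g. 'foo_12345') would not be matched by this sketch.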

-----Original Message-----
From: Wacek Kusnierczyk [mailto:Waclaw.Marcin.Kusnierczyk at idi.ntnu.no] 
Sent: Tuesday, June 09, 2009 1:06 PM
To: Greg Snow
Cc: Marc Schwartz; Barry Rowlingson; r-help at r-project.org; Tan, Richard
Subject: Re: [R] Regex question to find a string that contains 5-9
alpha-numeric characters, at least one of which is a number

Greg Snow wrote:
> Here is one way using a single pattern (so it can be used in a
> substitution); it uses Perl's positive lookahead patterns:
>
>> test <-
>> c("SHRT","5HRT","M1TCH","M1TCH5","LONG3RS","NONUMBER","TOOLOOOONGG","ooops.3")
>>
>> sub( '(?=[a-zA-Z]{0,8}[0-9])[a-zA-Z0-9]{5,9}', 'xxx', test, perl=TRUE)


yes, but:

    sub( '(?=[a-zA-Z]{0,8}[0-9])[a-zA-Z0-9]{5,9}', 'xxxxx', '12345', perl=TRUE)
    # "xxxxx"

which is not what was expected -- as far as i understand, the point was
to match 5-9 character strings with exactly 1 digit.

vQ
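
A small sketch of the distinction raised above, contrasting "at least one digit" with "exactly one digit" for whole tokens of 5-9 alphanumerics. The names `at_least_one` and `exactly_one` are hypothetical, and both patterns assume tokens are delimited by \b word boundaries, which is an assumption on top of the patterns posted in this thread.

```r
test <- c('12345', 'M1TCH', 'M1TCH5', 'NONUMBER')

# At least one digit: the lookahead only needs to find some digit
# after an optional run of letters.
at_least_one <- '\\b(?=[a-zA-Z]*[0-9])[a-zA-Z0-9]{5,9}\\b'

# Exactly one digit: the lookahead must reach the end of the token
# having seen letters, a single digit, then letters only.
exactly_one <- '\\b(?=[a-zA-Z]*[0-9][a-zA-Z]*\\b)[a-zA-Z0-9]{5,9}\\b'

res_at_least <- sub(at_least_one, 'xxx', test, perl = TRUE)
res_exactly  <- sub(exactly_one,  'xxx', test, perl = TRUE)

print(res_at_least)  # "xxx" "xxx" "xxx" "NONUMBER"
print(res_exactly)   # "12345" "xxx" "M1TCH5" "NONUMBER"
```

Under the "at least one" reading, '12345' is a legitimate match; only the "exactly one" variant leaves it untouched.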



