[R] regular expression help

C Lin baccts at hotmail.com
Fri Jun 27 15:27:58 CEST 2014


Thank you all for your help.

Bill, thanks for making it compact and I did mean any amount of whitespace.

To break it down, so I know why this pattern works:
The first parenthesis means that before AARSD1 it can be
^: begins with nothing
|: or
//: double slash or
[[:space:]]+: one or more whitespace character

For the second parenthesis:
$: ending with nothing

Does this sound correct?
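
One way to check the breakdown is to run Bill's pattern on the six-element test vector from earlier in the thread. This is only a sketch; the two extra strings in `extra` are made up here for illustration and were not in the original posts.

```r
# Test vector from the thread (six elements, including the bare 'AARSD1')
test <- c('AARSD11', 'AARSD1-', 'AARSD1//', 'AARSD1 //', '//AARSD1', 'AARSD1')

# Bill's pattern: (start | // | whitespace) AARSD1 (end | // | whitespace)
pattern <- "(^|//|[[:space:]]+)AARSD1($|//|[[:space:]]+)"

# Indices of the elements that match
hits <- grep(pattern, test)
print(hits)

# The matching strings themselves, as in Bill's value=TRUE call
print(grep(pattern, test, value = TRUE))

# Hypothetical extra cases: whitespace on both sides matches,
# a letter directly before AARSD1 does not
extra <- c('x AARSD1 y', 'xAARSD1')
extra_hits <- grepl(pattern, extra)
print(extra_hits)
```

Because `^` and `$` are only alternatives inside the parentheses, the pattern does not force the whole string to be AARSD1; it only constrains what sits immediately before and after it.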

I missed the fact that I need the ^ and $ and I always do [:space:]+ instead of [[:space:]]+
What's the difference between [:space:]+ and [[:space:]]+?
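
A small experiment may help with that question. It assumes R's default regex engine (not perl = TRUE), and the vector `x` is made up for illustration. The outer brackets delimit a bracket expression and `[:space:]` is a class name that is only valid inside one, so with single brackets the engine appears to read the text as a plain set of the literal characters : s p a c e.

```r
# Hypothetical inputs: one with a real space, one made only of the
# letters a, c, e, and one with neither
x <- c("AARSD1 //", "ace", "AARSD1//")

# Double brackets: a bracket expression containing the POSIX class
# [:space:], i.e. one or more whitespace characters
double_br <- grepl("[[:space:]]+", x)
print(double_br)

# Single brackets: read as a set of the literal characters
# ':', 's', 'p', 'a', 'c', 'e' (default engine), not as whitespace
single_br <- grepl("[:space:]+", x)
print(single_br)
```

So `[:space:]+` can silently match letters such as "a" or "e" and miss actual spaces, which would explain why the single-bracket attempts did not behave as expected.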

Thanks so much!
Lin

----------------------------------------
> From: wdunlap at tibco.com
> Date: Fri, 27 Jun 2014 02:35:54 -0700
> Subject: Re: [R] regular expression help
> To: dwinsemius at comcast.net
> CC: baccts at hotmail.com; r-help at r-project.org
>
> You can use parentheses to factor out the common string in David's
> pattern, as in
> grep(value=TRUE, "(^|//|[[:space:]]+)AARSD1($|//|[[:space:]]+)", test)
>
> (By 'whitespace' I could not tell if you meant any amount of
> whitespace or a single
> whitespace character. I use '+' to match one or more whitespace characters.)
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Thu, Jun 26, 2014 at 10:12 PM, David Winsemius
> <dwinsemius at comcast.net> wrote:
>>
>> On Jun 26, 2014, at 6:11 PM, C Lin wrote:
>>
>>> Hi Duncan,
>>>
>>> Thanks for trying to help. Sorry for not being clear.
>>> The string I'd like to get is 'AARSD1'
>>> It can be followed or preceded by white space or // or nothing
>>>
>>> so, from test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1','AARSD1');
>>>
>>> I want to match only 'AARSD1//','AARSD1 //','//AARSD1','AARSD1'
>>
>> Perhaps you want just
>>
>> grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>>
>>> grepl('^AARSD1//$|^AARSD1 //$|^//AARSD1$|^AARSD1$', test)
>> [1] FALSE FALSE TRUE TRUE TRUE TRUE
>>
>> --
>> David.
>>
>>>
>>
>>> Thanks,
>>> Lin
>>>
>>> ----------------------------------------
>>>> From: dulcalma at bigpond.com
>>>> To: baccts at hotmail.com; r-help at r-project.org
>>>> Subject: RE: [R] regular expression help
>>>> Date: Fri, 27 Jun 2014 10:59:29 +1000
>>>>
>>>> Hi
>>>>
>>>> You only have a vector of length 5 and I am not quite sure of the string you
>>>> are testing
>>>> so try this
>>>>
>>>> grep('[/]*\\<AARSD1\\>[/]*',test)
>>>>
>>>> Duncan
>>>>
>>>> Duncan Mackay
>>>> Department of Agronomy and Soil Science
>>>> University of New England
>>>> Armidale NSW 2351
>>>> Email: home: mackay at northnet.com.au
>>>>
>>>> -----Original Message-----
>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
>>>> Behalf Of C Lin
>>>> Sent: Friday, 27 June 2014 10:05
>>>> To: r-help at r-project.org
>>>> Subject: [R] regular expression help
>>>>
>>>> Dear R users,
>>>>
>>>> I need to match a string. It can be followed or preceded by whitespace or //
>>>> or nothing.
>>>> How do I code it in R?
>>>>
>>>> For example:
>>>> test <- c('AARSD11','AARSD1-','AARSD1//','AARSD1 //','//AARSD1');
>>>> grep('AARSD1(\\s*//*)',test);
>>>>
>>>> should return 3,4,5 and 6.
>>>>
>>>
>>
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

