[R] how to match exact phrase using gsub (or similar function)

Justin Haynes jtor14 at gmail.com
Wed Mar 28 22:45:49 CEST 2012


Wow! And here I thought I was starting to know most things about regexes...
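
For the record, here is a minimal sketch of the word-boundary approach from Bill's reply (quoted below), run on a few more inputs. The vector name `test_addresses` and the sample "Main St S S" are made up for illustration. The `\\b` variant with `perl = TRUE` is an alternative spelling that should behave the same on these inputs; it is not something from the thread.

```r
# Sample inputs: the first three appear in the thread, the last is made up
# to cover the end-of-string case.
test_addresses <- c("S S Main St & Interstate 95",
                    "3421 BIGS St",
                    "foo S S bar",
                    "Main St S S")

# \\< and \\> match the start and end of a word (default regex engine).
fixed_words <- gsub("\\<S S\\>", "S", test_addresses)

# \\b matches either kind of word boundary; shown here with perl = TRUE.
fixed_b <- gsub("\\bS S\\b", "S", test_addresses, perl = TRUE)

print(fixed_words)
print(identical(fixed_words, fixed_b))
```

"3421 BIGS St" is left alone because the S at the end of "BIGS" does not start a word and the S in "St" does not end one.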

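The three-case alternation from my earlier message (quoted further down) can also be done in one pass by capturing the delimiters and putting them back, instead of following up with a second gsub. This is only a sketch: `collapse_double_s` and `examples` are hypothetical names, and "Main St S S" is a made-up input.

```r
# Hypothetical helper: collapse a doubled "S S" only when it is delimited by
# the start/end of the string or by spaces.
collapse_double_s <- function(x) {
  # \\1 is the leading delimiter (empty at the string start, or a space);
  # \\2 is the trailing delimiter (a space, or empty at the string end).
  gsub("(^| )S S( |$)", "\\1S\\2", x)
}

examples <- c("S S Main St & Interstate 95",
              "3421 BIGS St",
              "foo S S bar",
              "Main St S S")

result <- collapse_double_s(examples)
print(result)
```

Each match consumes its surrounding spaces, so two doubled directions separated by a single space would not both be caught in one pass. The word-boundary patterns avoid that because they match positions rather than characters.
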
On Wed, Mar 28, 2012 at 1:34 PM, William Dunlap <wdunlap at tibco.com> wrote:
> You can use the \< and \> patterns (doubling the backslashes inside an R
> string) to match the start and end of a "word", respectively.  E.g.,
>
>  > addresses <- c("S S Main St & Interstate 95", "3421 BIGS St")
>  > gsub("\\<S S\\>", "S", addresses)
>  [1] "S Main St & Interstate 95" "3421 BIGS St"
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Justin Haynes
>> Sent: Wednesday, March 28, 2012 1:24 PM
>> To: Markus Weisner
>> Cc: r-help at r-project.org
>> Subject: Re: [R] how to match exact phrase using gsub (or similar function)
>>
>> In most regexes the caret ( ^ ) signifies the start of a line and the
>> dollar sign ( $ ) signifies the end.
>>
>> gsub('^S S', 'S', a)
>>
>> gsub('^S S', 'S', '3421 BIGS St')
>>
>> You can use alternation (a logical OR, written | ) inside your pattern too:
>>
>> gsub('^S S|S S$| S S ', 'S', a)
>>
>> the " S S " condition is difficult.
>>
>> gsub('^S S|S S$| S S ', 'S', 'foo S S bar')
>>
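>> gives the wrong output ("fooSbar", because the surrounding spaces are
>> replaced too). Padding the pattern and replacement with spaces fixes that
>> case but leaves a stray leading space when the match starts the string: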
>>
>> gsub('^S S | S S$| S S ', ' S ', 'foo S S bar')
>> gsub('^S S | S S$| S S ', ' S ', a)
>>
>>
>> so you might have to catch that with a second gsub.
>>
>> gsub(' S S ', ' S ', 'foo S S bar')
>>
>>
>> On Wed, Mar 28, 2012 at 12:32 PM, Markus Weisner <r at themarkus.com> wrote:
>> > I am trying to fix addresses that have doubled directions, such as the
>> > following example:
>> >
>> > a = "S S Main St & Interstate 95"
>> >
>> > a = gsub(pattern="S S ", replacement="S ", a)
>> >
>> >
>> > The problem is that I don't want to affect instances where this might be
>> > part of a correct address, such as the following:
>> >
>> >
>> > "3421 BIGS St"
>> >
>> >
>> > What I want is to make the replacement only in one of the following
>> > situations:
>> >
>> >
>> > "[beginning of char]S S"
>> >
>> > " S S "
>> >
>> > "S S[end of char]"
>> >
>> >
>> > Is there any way of making gsub or a similar function make the replacements
>> > I want?  Thanks in advance for your help.
>> >
>> >
>> > ~Markus
>> >
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
