[R] Regular expressions in R

Sarah Goslee sarah.goslee at gmail.com
Tue Nov 15 18:30:06 CET 2011


Hi Michael,

You need to take another look at the examples you were given, and at
the help page for sub() (see ?sub):

     The two ‘*sub’ functions differ only in that ‘sub’ replaces only
     the first occurrence of a ‘pattern’ whereas ‘gsub’ replaces all
     occurrences.  If ‘replacement’ contains backreferences which are
     not defined in ‘pattern’ the result is undefined (but most often
     the backreference is taken to be ‘""’).
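
As a rough sketch of that difference (not necessarily the pattern from
the earlier replies): the string below is a shortened version of yours,
and `pat` is my own assumption about what counts as a target, namely a
"+", a name, a "*" and a name, with optional whitespace around each.

```r
# Shortened version of the formula string from the question (no spaces).
form <- "~Sentence+Intro*LEGAL+benefit+action*mean+CTA"

# Assumed pattern (illustrative only): "+", name, "*", name,
# each optionally surrounded by whitespace.
pat <- "[[:space:]]*\\+[[:space:]]*[[:alnum:]]+[[:space:]]*\\*[[:space:]]*[[:alnum:]]+"

first_only <- sub(pat, "", form)   # replaces only the first match
all_terms  <- gsub(pat, "", form)  # replaces every match

print(first_only)
print(all_terms)

# The same pattern applied to a spaced string.
spaced <- gsub(pat, "", "~ a + b * c + d")
print(spaced)
```

Note also that in your pattern the tail `.[[:alnum:]]` matches exactly
two characters after the "*", so only part of the second name is
removed; a `+` quantifier after the class matches the whole name.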

Sarah

On Tue, Nov 15, 2011 at 12:18 PM, Michael Griffiths
<griffiths at upstreamsystems.com> wrote:
> Good afternoon list,
>
> I have the following character strings; one with spaces between the maths
> operators and variable names, and one without said spaces.
>
> form<-c('~ Sentence + LEGAL + Intro + Intro / Intro1 + Intro * LEGAL +
> benefit + benefit / benefit1 + product + action * mean + CTA + help + mean
> * product')
> form<-c('~Sentence+LEGAL+Intro+Intro/Intro1+Intro*LEGAL+benefit+benefit/benefit1+product+action*mean+CTA+help+mean*product')
>
> I would like to remove the following target strings, either:
>
> 1. '+ Intro * LEGAL' which is  '+ space name space * space name'
> 2. '+Intro*LEGAL' which is  '+ nospace name nospace * nospace name'
>
> Having delved into a variety of sites (e.g.
> http://www.zytrax.com/tech/web/regex.htm#search) investigating regular
> expressions I now have a basic grasp, but I am having difficulties removing
> ALL of the instances of 1. or 2.
>
> The code below removes just a SINGLE instance of the target string, but I
> was expecting it to remove all instances as I have \\*.[[:alnum:]]. I did
> try \\*.[[:alnum:]]*, but this did not work.
>
> form<-sub("\\+*\\s*[[:alnum:]]*\\s*\\*.[[:alnum:]]", "", form)
>
> I am obviously still not understanding something. If the list could offer
> some guidance I would be most grateful.
>
> Regards
>
> Mike Griffiths
>
>
>
-- 
Sarah Goslee
http://www.functionaldiversity.org


