[R] Grep command

William Dunlap wdunlap at tibco.com
Wed May 4 19:23:01 CEST 2016


No matter how expert you are at writing regular expressions,
it is important to list which sorts of strings you want matched
and which you do not want matched.  Saying you want to match
"age" but not "age2" leads to lots of possibilities.  Saying how
you want to categorize each string in a vector of strings like
the following would narrow things down.
   c("age", "ages ago", "age 60", "An aged man", "page", "Age", "age1",
      "age2",  "dark age", "the aGE")
From such a list, make a good verbal description of the rule you
are thinking of and someone will be able to translate that into a regular
expression (or say that regular expressions cannot do the job).
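
As a rough sketch of that exercise, one can run several candidate patterns
over the vector and see how each one sorts the strings.  The four rules and
their names below are illustrative assumptions about what "age but not age2"
might mean, not rules anyone in this thread has settled on.

```r
x <- c("age", "ages ago", "age 60", "An aged man", "page", "Age", "age1",
       "age2", "dark age", "the aGE")

# Hypothetical candidate rules, each a different reading of
# "match age but not age2".
patterns <- c(
  exact       = "^age$",        # the whole string is exactly "age"
  ends_in_age = "age$",         # the string ends in "age"
  whole_word  = "\\bage\\b",    # "age" stands alone as a word
  no_digit    = "age(?![0-9])"  # "age" not followed by a digit (PCRE lookahead)
)

# perl = TRUE is required for the lookahead and is harmless for the others.
matches <- lapply(patterns, function(p) grep(p, x, perl = TRUE, value = TRUE))
print(matches)
```

All four rules accept "age" and reject "age2", yet they disagree on strings
such as "page", "age 60" and "ages ago", which is why the list of wanted and
unwanted matches has to come first.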


Bill Dunlap
TIBCO Software
wdunlap at tibco.com

On Wed, May 4, 2016 at 9:59 AM, David Winsemius <dwinsemius at comcast.net>
wrote:

>
> > On May 3, 2016, at 11:16 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
> >
> > Yes, but the answer is likely to depend on the actual patterns of
> strings in your real data, so the sooner you go find a book or tutorial on
> regular expressions the better.  This is decidedly not R specific and there
> are already lots of resources out there.
> >
> > Given the example you provide,  the pattern "age$" should work. However,
> that is probably not sufficiently selective for a practical data set so
> start learning to fish (design regex patterns) yourself.
>
> @ Steven;
>
> As is almost always the case I agree with Jeff. I found that reading Rhelp
> and attempting to answer regex-questions was the best method to learn them.
> In particular I found the postings by Gabor Grothendieck very helpful in
> getting some degree of competence in this area. I see that his grep-related
> postings still exceed my grep postings and I assure you that his will be
> more sophisticated than my efforts. I recommend the MarkMail Rhelp mirror
> interface as very useful in "mining" Rhelp for knowledge:
>
> Gabor Grothendieck answers with either 'grep' or 'regex' in their body:
>
>
> http://markmail.org/search/?q=list%3Aorg.r-project.r-help+list%3Agrep+list%3Aregex+from%3A%22Gabor+Grothendieck
>
> --
> Happy searching;
> David.
>
>
> > --
> > Sent from my phone. Please excuse my brevity.
> >
> > On May 3, 2016 10:45:42 PM PDT, Steven Yen <syen04 at gmail.com> wrote:
> >> Dear all
> >> In the grep command below, is there a way to identify only "age" and
> >> not "age2"? In other words, I would like to grep "age" and "age2"
> >> separately, one at a time. Thanks.
> >>
> >> x<-c("abc","def","rst","xyz","age","age2")
> >> x
> >>
> >> [1] "abc"  "def"  "rst"  "xyz"  "age"  "age2"
> >>
> >> grep("age2",x)
> >>
> >> [1] 6
> >>
> >> grep("age",x) # I need to grab "age" only, not "age2"
> >>
> >> [1] 5 6
> >>
>
> David Winsemius
> Alameda, CA, USA
>
