[R] Regular Expressions

Gabor Grothendieck ggrothendieck at myway.com
Tue Jul 13 03:41:30 CEST 2004


Sangick Jeon <sijeon <at> ucdavis.edu> writes:

> Is there a way to use regular expressions to capture two or more words in a
> sentence?  For example, I wish to find all the lines that have the words
> "thomas", "perl", and "program", such as "thomas uses a program called perl",
> or "perl is a program that thomas uses", etc.

If you have only two patterns to search for, a single regular expression
that lists both possible orders will do:

   data(state)
   grep("i.*n|n.*i", state.name)  # states with i and n in name

but this gets unwieldy with three patterns, since there are then 6 orderings
to list rather than 2.  In that case, you are probably better off applying
grep once per pattern, each time keeping only the elements that matched:

   # keep only the elements of x that match every pattern in pat
   lookfor <- function(pat, x) {
      for (p in pat) x <- grep(p, x, value = TRUE)
      x
   }

   lookfor(c("i","n","g"), state.name)  # states with i, n and g in name
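Applied to the sentences in the original question, the same idea might look
like the sketch below.  The vector `lines` is made-up sample data, and the
names `words`, `has_all` and `pat` are only illustrative.  The first form is a
vectorized variant of the loop above, using grepl and a logical AND; the second
is a different technique, a single pattern built from lookaheads, which needs
perl = TRUE.  Both match substrings, so "program" would also match "programs";
wrapping each word in \\b would be one way to require whole words.

```r
# Hypothetical sample lines, modelled on the sentences in the question
lines <- c("thomas uses a program called perl",
           "perl is a program that thomas uses",
           "thomas uses a program called python",
           "perl is a language")
words <- c("thomas", "perl", "program")

# One grepl() pass per word, then AND the logical vectors together
has_all <- Reduce(`&`, lapply(words, grepl, x = lines, fixed = TRUE))
lines[has_all]

# Alternative: one regular expression with a lookahead per word,
# i.e. "^(?=.*thomas)(?=.*perl)(?=.*program)"; requires perl = TRUE
pat <- paste0("^", paste0("(?=.*", words, ")", collapse = ""))
grep(pat, lines, perl = TRUE, value = TRUE)
```

Either form should select only the first two sample lines, in any word order,
without having to spell out the 6 orderings by hand.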



