[R] Using grep() to subset lines of text

Gabor Grothendieck ggrothendieck at gmail.com
Sun Nov 30 05:40:38 CET 2008


Try this.  Each character x in s is replaced with "\\x" (a backslash
followed by x) if it is punctuation, and with "[x]" otherwise:

library(gsubfn)
# s holds the literal string to be quoted, e.g. s <- "(ab)"
gsubfn('.', ~ if (any(grep("[[:punct:]]", x))) paste0('\\', x) else
              paste0('[', x, ']'), s)
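
For anyone who wants to see the per-character rule without installing
gsubfn, here is a rough base-R sketch of the same idea.  The name
quote_chars is made up for illustration; it is not part of gsubfn or
base R.

```r
# Hypothetical helper: apply the rule above to each character of s.
quote_chars <- function(s) {
  chars <- strsplit(s, "")[[1]]                 # one element per character
  is_punct <- grepl("[[:punct:]]", chars)       # TRUE where punctuation
  quoted <- ifelse(is_punct,
                   paste0("\\", chars),         # punctuation: backslash + x
                   paste0("[", chars, "]"))     # anything else: [x]
  paste0(quoted, collapse = "")
}

quote_chars("(ab)")                             # "\\([a][b]\\)" as printed by R

# Build an anchored regexp around the quoted literal.
re <- paste0("^", quote_chars("(ab)"))
grep(re, c("ab", "(ab)"))                       # 2
```

The quoted piece can then be pasted together with ordinary regexp
fragments such as "^", which is what fixed = TRUE does not allow.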

See http://gsubfn.googlecode.com
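
A different way to get something like Emacs's regexp-quote is a single
gsub() call that backslash-escapes a fixed list of metacharacters.  This
is only a sketch: regexp_quote is a hypothetical name, and the list of
characters is my assumption about what needs quoting outside a bracket
expression.

```r
# Hypothetical helper: backslash-escape the characters \ ^ $ . | ? * + ( ) [ ] { }
# perl = TRUE is used so that backslash escapes inside the character
# class are interpreted the PCRE way.
regexp_quote <- function(s) {
  gsub("([\\\\^$.|?*+()\\[\\]{}])", "\\\\\\1", s, perl = TRUE)
}

regexp_quote("(ab)")                            # "\\(ab\\)" as printed by R

# Combine the quoted literal with a regexp anchor.
re <- paste0("^", regexp_quote("(ab)"))
grep(re, c("ab", "(ab)"))                       # 2
```

Escaping only a fixed list, rather than all punctuation, sidesteps
sequences such as "\\<", which ?regex documents as matching the start of
a word rather than a literal "<".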


On Sat, Nov 29, 2008 at 10:09 PM, Stavros Macrakis
<macrakis at alum.mit.edu> wrote:
> But I don't want to ignore all regexp's -- I want to build a regexp which
> contains string components which are parameters.
>
>             -s
>
> On Sat, Nov 29, 2008 at 6:51 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>>
>> grep has a fixed = TRUE argument if you want to ignore all regexp's.
>>
>> On Sat, Nov 29, 2008 at 3:55 PM, Stavros Macrakis <macrakis at alum.mit.edu>
>> wrote:
>> > Hmm, this brings up an interesting question.  What if the string I'm
>> > looking for contains regexp metacharacters?  For example,
>> > grep( paste( "^", "(ab)", sep = "" ), c("ab","(ab)") ) => c(1), not c(2).
>> >
>> > I couldn't find an equivalent to Emacs's regexp-quote, which would let
>> > me write regexp.quote("(ab)") => "\\(ab\\)".  The syntax of regular
>> > expressions is complicated enough that this is not trivial. Is there
>> > perhaps a CRAN package with regular expression utilities?
>> >
>> >             -s
>> >
>> > On Sat, Nov 29, 2008 at 7:12 AM, Gabor Grothendieck
>> > <ggrothendieck at gmail.com> wrote:
>> >>
>> >> > a <- 2:3
>> >> > b <- c("aaa 2 aaa", "2 aaa", "3 aaa", "aaa 3 aaa")
>> >> > re <- paste("^(", paste(a, collapse = "|"), ")", sep = "")
>> >> > grep(re, b, value = TRUE)
>> >> [1] "2 aaa" "3 aaa"
>> >
>> >
>
>


