[R] regex -> negate a word

Gabor Grothendieck ggrothendieck at gmail.com
Sun Jan 18 22:44:13 CET 2009


Well, that's why it was only provided when you insisted.  This is
not what regexps are good at.
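
For readers on a current R, a negative lookahead is one way to say "does not contain abc" in a single pattern. The sketch below assumes perl = TRUE (PCRE) and the example vector x from the original question; the name no_abc is only illustrative and not part of the thread.

```r
# Example vector from the original question
x <- c("abcdef", "defabc", "qwerty")

# (?!.*abc) is a negative lookahead: it succeeds at the start of the
# string only if "abc" does not occur anywhere after that point.
# Lookaheads are a PCRE feature, so perl = TRUE is required.
no_abc <- grep("^(?!.*abc)", x, perl = TRUE)
no_abc
# [1] 3

# Values instead of indexes
no_abc_values <- grep("^(?!.*abc)", x, perl = TRUE, value = TRUE)
no_abc_values
# [1] "qwerty"
```

Strings with embedded newlines may need the (?s) flag so that "." also crosses line breaks; that case is not covered by the sketch.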

On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland <Rau at demogr.mpg.de> wrote:
> Thanks! (I have to admit, though, that I expected something simple)
>
> Thanks,
> Roland
>
>
>
> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
> Sent: Sun 1/18/2009 8:54 PM
> To: Rau, Roland
> Cc: r-help at r-project.org
> Subject: Re: [R] regex -> negate a word
>
> Try this:
>
> grep("^([^a]|a[^b]|ab[^c])*.{0,2}$", x, perl = TRUE)
>
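
A quick check of the pattern above, as a sketch: it gives the expected answer for the vector from the original question, but it appears to let through strings in which "abc" is directly preceded by "a" or "ab". The vector y and the names pat, hits_x, hits_y and truth are hypothetical and only serve the illustration.

```r
pat <- "^([^a]|a[^b]|ab[^c])*.{0,2}$"

# Vector from the original question: only element 3 lacks "abc"
x <- c("abcdef", "defabc", "qwerty")
hits_x <- grep(pat, x, perl = TRUE)
hits_x
# [1] 3

# Hypothetical inputs where "abc" follows an "a" or an "ab"
y <- c("aabc", "ababc", "xyz")

# "aa" is consumed by a[^b] and "aba" by ab[^c], so the trailing "bc"
# is then accepted character by character via [^a]
hits_y <- grep(pat, y, perl = TRUE)
hits_y
# [1] 1 2 3

# Reference answer from a plain fixed-string search, negated
truth <- which(!grepl("abc", y, fixed = TRUE))
truth
# [1] 3
```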
>
> On Sun, Jan 18, 2009 at 2:37 PM, Rau, Roland <Rau at demogr.mpg.de> wrote:
>> Thank you very much to all of you for your fast and excellent help.
>> Since the "-grep(...)" solution seems to be favored by most of the
>> answers, I just wonder whether there is really no regular expression
>> that does the job?
>>
>> Thanks again,
>> Roland
>>
>>
>>
>> -----Original Message-----
>> From: Gabor Grothendieck [mailto:ggrothendieck at gmail.com]
>> Sent: Sun 1/18/2009 8:28 PM
>> To: Rau, Roland
>> Cc: r-help at r-project.org
>> Subject: Re: [R] regex -> negate a word
>>
>> Try this:
>>
>> # indexes
>> setdiff(seq_along(x), grep("abc", x))
>>
>> # values
>> setdiff(x, grep("abc", x, value = TRUE))
>>
>> Another possibility is:
>>
>> z <- "abc"
>> x0 <- c(x, z) # to handle no match case
>> x0[- grep(z, x0)] # values
>>
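
The "no match case" handled above can be seen in a small sketch. It assumes a current R, where grepl() and the invert argument of grep() are available; the vector y and the result names are hypothetical.

```r
y <- c("qwerty", "zzz")   # hypothetical vector with no "abc" at all

# grep() finds nothing, and negating an empty index selects nothing,
# which is why the reply appends the pattern itself to x first
dropped <- y[-grep("abc", y)]
dropped
# character(0)

# A logical index from grepl() does not have this edge case
kept <- y[!grepl("abc", y)]
kept
# [1] "qwerty" "zzz"

# Same result using grep's own invert argument
kept_invert <- grep("abc", y, invert = TRUE, value = TRUE)
kept_invert
# [1] "qwerty" "zzz"
```

Note also that setdiff() returns unique values, so the "values" variant above drops duplicates from x, while the logical index keeps them.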
>>
>>
>>
>> On Sun, Jan 18, 2009 at 1:35 PM, Rau, Roland <Rau at demogr.mpg.de> wrote:
>>> Dear all,
>>>
>>> let's assume I have a vector of character strings:
>>>
>>> x <- c("abcdef", "defabc", "qwerty")
>>>
>>> What I would like to find is the following: all elements where the word
>>> 'abc' does not appear (i.e. element 3 of 'x' in this case).
>>>
>>> Since I am not really experienced with regular expressions, I started
>>> slowly and thought I would first find all elements where 'abc' actually
>>> does appear:
>>>
>>>> grep(pattern="abc", x=x)
>>> [1] 1 2
>>>
>>> So far, so good. Now I read that ^ is the negation operator. But it can
>>> also denote the beginning of a string as in:
>>>
>>>> grep(pattern="^abc", x=x)
>>> [1] 1
>>>
>>> Of course, we need to put it inside square brackets to negate the
>>> expression [1]
>>>> grep(pattern="[^abc]", x=x)
>>> [1] 1 2 3
>>>
>>> But this is not what I want either.
>>>
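
Why "[^abc]" matches all three elements can be illustrated with a short sketch: inside square brackets the caret negates a set of single characters, not the word "abc", so any string containing at least one character other than a, b or c will match. The names first_other and only_abc are illustrative.

```r
x <- c("abcdef", "defabc", "qwerty")

# First character in each string that is not "a", "b" or "c"
first_other <- regmatches(x, regexpr("[^abc]", x))
first_other
# [1] "d" "d" "q"

# Only strings built entirely from a, b and c fail to match
only_abc <- grep("[^abc]", c("abc", "cab", "aaa"))
only_abc
# integer(0)
```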
>>> I'd appreciate any help. I assume this is rather easy and
>>> straightforward.
>>>
>>> Thanks,
>>> Roland
>>>
>>>
>>> [1] http://www.zytrax.com/tech/web/regex.htm: The ^ (circumflex or
>>> caret) inside square brackets negates the expression....
>>>
>>>
>>
>>
>
>



