[R] regular expressions with grep() and negative indexing

jim holtman jholtman at gmail.com
Wed Apr 25 13:19:24 CEST 2007


Find the ones that match and then remove them from the full set with 'setdiff'.

> x <- c("seal.0","seal.1-exclude")
> x.match <- grep("exclude", x)  # find matches
> x.match
[1] 2
> setdiff(seq_along(x), x.match)  # exclude the matches
[1] 1
>
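
The indices returned by setdiff() can be used directly to subset the vector. The same expression still works when nothing matches, because setdiff() then returns every index. A minimal sketch; drop_matching is a hypothetical helper name used only for illustration, not a base R function:

```r
# Hypothetical helper: return the elements of x that do NOT match pattern.
drop_matching <- function(x, pattern) {
  # Indices of all elements, minus the indices that match the pattern.
  keep <- setdiff(seq_along(x), grep(pattern, x))
  x[keep]
}

x  <- c("seal.0", "seal.1-exclude")
x2 <- c("dolphin.0", "dolphin.1")

drop_matching(x, "exclude")   # "seal.0"
drop_matching(x2, "exclude")  # no matches, so both elements are kept
```

Because the subsetting is done with positive indices, there is no special case for an empty match.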


On 4/25/07, Peter Dalgaard <P.Dalgaard at biostat.ku.dk> wrote:
> Stephen Tucker wrote:
> > Dear R-helpers,
> >
> > Does anyone know how to use regular expressions to return vector elements
> > that don't contain a word? For instance, if I have a vector
> >   x <- c("seal.0","seal.1-exclude")
> > I'd like to get back the elements which do not contain the word "exclude",
> > using something like the following (I know this doesn't work):
> >   grep("[^(exclude)]",x)
> >
> > I can use
> >   x[-grep("exclude",x)]
> > for this case but then if I use this expression in a recursive function, it
> > will not work for instances in which the vector contains no elements with
> > that word. For instance, if I have
> >   x2 <- c("dolphin.0","dolphin.1")
> > then
> >   x2[-grep("exclude",x2)]
> > will give me 'character(0)'
> >
> > I know I can accomplish this in several steps, for instance:
> >   myfunc <- function(x) {
> >     iexclude <- grep("exclude",x)
> >     if(length(iexclude) > 0) x2 <- x[-iexclude] else x2 <- x
> >     # do stuff with x2 <...?
> >   }
> >
> > But this is embedded in a much larger function and I am trying to minimize
> > intermediate variable assignment (perhaps a futile effort). But if anyone
> > knows of an easy solution, I'd appreciate a tip.
> >
> It has come up a couple of times before, and yes, it is a bit of a pain.
>
> Probably the quickest way out is
>
> negIndex <- function(i) {
>   if (length(i)) -i else TRUE
> }
>
> --
>   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
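
For reference, x2[-grep("exclude", x2)] returns character(0) because negating a zero-length index still gives a zero-length index, and subsetting with that selects nothing. The negIndex() function quoted above sidesteps this by returning TRUE, which selects every element. A short sketch of how it might be used; the last line is an alternative that assumes a version of R whose grep() accepts the 'invert' argument, which may be newer than the R current when this thread was written:

```r
# Return a negative index when there are matches, otherwise TRUE (keep all).
negIndex <- function(i) if (length(i)) -i else TRUE

x  <- c("seal.0", "seal.1-exclude")
x2 <- c("dolphin.0", "dolphin.1")

x[negIndex(grep("exclude", x))]     # "seal.0"
x2[negIndex(grep("exclude", x2))]   # no matches, so both elements are kept

# Assumption: grep() in the installed R supports invert = TRUE.
grep("exclude", x2, invert = TRUE, value = TRUE)
```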


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


