[R] grep for strings

Petr Savicky savicky at cs.cas.cz
Sun Dec 5 08:51:54 CET 2010


On Sun, Dec 05, 2010 at 08:04:08AM +0530, Santosh Srinivas wrote:
> I am trying to find the function where I can search for a pattern in a
> text string (I thought I could use grep for this but no :().
> 
> > x
> [1] "abcdefghijkl"
> 
> I want to find the positions (i.e. equivalent of nchar) for "cd" and,
> in case there are multiple hits, the results as an array.

For a single string, gregexpr() returns the start positions of all
matches, for example

  p <- gregexpr("cd", "abcdecdecdcd")[[1]]

  p
  [1]  3  6  9 11
  attr(,"match.length")
  [1] 2 2 2 2

  as.numeric(p) # [1]  3  6  9 11
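
If the end positions are needed as well, they can be derived from the
"match.length" attribute shown above. The following is only a sketch: the
variable names x, start and end are mine, and fixed = TRUE is an assumption
that the pattern is meant as a literal string rather than a regular
expression.

```r
x <- "abcdecdecdcd"

# fixed = TRUE treats "cd" as a literal string, not as a regular expression
p <- gregexpr("cd", x, fixed = TRUE)[[1]]

start <- as.integer(p)                        # 3 6 9 11
end <- start + attr(p, "match.length") - 1L   # 4 7 10 12
cbind(start, end)

# Caveat: when nothing matches, gregexpr() reports -1, not an empty vector
as.integer(gregexpr("zz", x, fixed = TRUE)[[1]])   # -1
```

Note the last line: a result of -1 means "no match" and should not be used
as a position.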

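For a vector of strings, gregexpr() returns a list with one element per
string. The positions can be collected into a matrix padded with NA, for
example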

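  # one list element per input string, holding the match start positions
  p <- gregexpr("cd", c("abcde", "acdecde", "abcdecdecdcd", "cdcd"))
  m <- max(sapply(p, length))              # largest number of hits
  out <- matrix(NA_integer_, nrow=length(p), ncol=m)
  for (i in seq_len(nrow(out))) {
      out[i, seq_along(p[[i]])] <- p[[i]]  # unused columns stay NA
  }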

  out
       [,1] [,2] [,3] [,4]
  [1,]    3   NA   NA   NA
  [2,]    2    5   NA   NA
  [3,]    3    6    9   11
  [4,]    1    3   NA   NA
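
The loop above puts -1 into the matrix for a string with no match, since
that is what gregexpr() returns in that case. A possible way to wrap the
same idea in a function that yields NA instead is sketched below. The name
match_positions() is hypothetical (it is not a base R function), and the
choice to return a row of NA for strings without a hit is my assumption
about what is wanted.

```r
# match_positions() is a hypothetical helper name, not part of base R
match_positions <- function(pattern, x, ...) {
    p <- gregexpr(pattern, x, ...)
    # gregexpr() reports "no match" as -1; keep only real positions
    hits <- lapply(p, function(v) {
        v <- as.integer(v)
        v[v > 0L]
    })
    # at least one column, so that the result is always a matrix
    m <- max(1L, vapply(hits, length, integer(1)))
    out <- matrix(NA_integer_, nrow = length(hits), ncol = m)
    for (i in seq_along(hits)) {
        out[i, seq_along(hits[[i]])] <- hits[[i]]
    }
    out
}

match_positions("cd", c("abcde", "xyz", "cdcd"))
#      [,1] [,2]
# [1,]    3   NA
# [2,]   NA   NA
# [3,]    1    3
```

Additional arguments such as fixed = TRUE are passed on to gregexpr()
through "...".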

Petr Savicky.


