[R] vectorized sub, gsub, grep, etc.

Adam Erickson adam.michael.erickson at gmail.com
Mon Aug 3 10:30:22 CEST 2015


Interesting. I know of no practical use for such a function. If the first
target were 'abb', sub() would return 'aBb', leaving the second 'b'
unreplaced, and I find it hard to believe that is the desired behavior.
For speed, a looped regex function written in Rcpp makes the most sense.
The Boost C++ regex library (link
<http://gallery.rcpp.org/articles/boost-regular-expressions/>) and a C++
wrapper for PCRE (link <https://gist.github.com/abicky/58ea79b01d9e394d5076>)
are two options, but pure Rcpp would be ideal because it avoids external
software dependencies.
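
To make the sub() versus gsub() point concrete, here is a minimal base R
sketch (no Rcpp, so it is not the fast version). The helper names
pairwise_sub and sequential_gsub are hypothetical, made up for this
illustration. The second helper only imitates the "try every pattern on
every target" idea and is not qdap's actual mgsub() implementation.

```r
# Example vectors from the quoted message below.
X    <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")

# Hypothetical helper: pattern i and replacement i are applied only to
# target i. Pass FUN = gsub to replace every match within a target
# instead of just the first one.
pairwise_sub <- function(pattern, replacement, x, FUN = sub, ...) {
  mapply(FUN, pattern, replacement, x,
         MoreArgs = list(...), USE.NAMES = FALSE)
}

pairwise_sub(patt, repl, X)                # "aB" "CD" "ef"
pairwise_sub("b", "B", "abb")              # "aBb"
pairwise_sub("b", "B", "abb", FUN = gsub)  # "aBB"

# Hypothetical helper: every pattern is applied to every target, one
# pattern after another, so later patterns see earlier replacements.
sequential_gsub <- function(pattern, replacement, x, ...) {
  for (i in seq_along(pattern)) {
    x <- gsub(pattern[i], replacement[i], x, ...)
  }
  x
}

sequential_gsub(patt, repl, X)             # "AB" "CD" "ef"
```

Both helpers pass extra arguments such as fixed = TRUE or perl = TRUE
through to the underlying function, so the regex flavor can be chosen per
call.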

Cheers,

Adam

On Sun, Aug 2, 2015 at 9:42 PM, John Thaden <jjthaden at flash.net> wrote:

> Adam,
>
> The original posting gave a function sub2 whose aim differs both from your
> functions' aim and from the intent of mgsub() in the qdap package:
>
> > Here is code to apply a different
> > pattern and replacement for every target.
>
>     #Example
>     X <- c("ab", "cd", "ef")
>     patt <- c("b", "cd", "a")
>     repl <- c("B", "CD", "A")
>
> The first pattern ('b') and the first replacement ('B') therefore apply
> only to the first target ('ab'), the second to the second, etc. The
> function achieves its aim, giving the correct answer 'aB', 'CD', 'ef'.
>
> mgsub() satisfies a different need, testing all targets for matches with
> any pattern in the vector of patterns and, if a match is found, replacing
> the matched text with the replacement value corresponding to the matched
> pattern. It, too, achieves its aim, giving a different (but also correct)
> answer 'AB', 'CD', 'ef'.
>
> Regards,
> -John
