# [R] vectorized sub, gsub, grep, etc.

Thu Oct 9 06:38:01 CEST 2008

```
Hello Christos,
To my surprise, vectorization actually hurt processing speed!

#Example
X <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")

sub2 <- function(pattern, replacement, x) {
    len <- length(x)
    if (length(pattern) == 1)
        pattern <- rep(pattern, len)
    if (length(replacement) == 1)
        replacement <- rep(replacement, len)
    FUN <- function(i, ...) {
        sub(pattern[i], replacement[i], x[i], fixed = TRUE)
    }
    idx <- 1:length(x)
    sapply(idx, FUN)
}

system.time(for (i in 1:10000) sub2(patt, repl, X))
   user  system elapsed
   1.18    0.07    1.26

system.time(for (i in 1:10000)
    mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X))
   user  system elapsed
   1.42    0.05    1.47

So much for avoiding loops.

======= At 2008-10-07, 14:58:10 Christos wrote: =======

>John,
>Try the following:
>
> mapply(function(p, r, x) sub(p, r, x, fixed = TRUE), p=patt, r=repl, x=X)
>   b   cd    a
>"aB" "CD" "ef"
>
>-Christos

>> -----My Original Message-----
>> R pattern-matching and replacement functions are
>> vectorized: they can operate on vectors of targets.
>> However, they can only use one pattern and replacement.
>> Here is code to apply a different pattern and replacement for
>> every target.  My question: can it be done better?
>>
>> sub2 <- function(pattern, replacement, x) {
>>     len <- length(x)
>>     if (length(pattern) == 1)
>>         pattern <- rep(pattern, len)
>>     if (length(replacement) == 1)
>>         replacement <- rep(replacement, len)
>>     FUN <- function(i, ...) {
>>         sub(pattern[i], replacement[i], x[i], fixed = TRUE)
>>     }
>>     idx <- 1:length(x)
>>     sapply(idx, FUN)
>> }
>>
>> #Example
>> X <- c("ab", "cd", "ef")
>> patt <- c("b", "cd", "a")
>> repl <- c("B", "CD", "A")
>> sub2(patt, repl, X)
>>
>> -John

```
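Both versions in the thread still call `sub()` once per element from R code, so neither `sapply` nor `mapply` removes the per-call overhead. That may be why the timings come out so close. As a rough sketch only, the same per-element idea can be written with `vapply()` and `rep_len()`. The name `sub_each` is made up for this illustration and does not appear in the thread. No speed claim is intended, and timings will vary by machine and R version.

```r
# sub_each is a hypothetical helper, not part of the original thread.
# It applies the i-th pattern and replacement to the i-th target,
# recycling pattern and replacement to the length of x.
sub_each <- function(pattern, replacement, x) {
  n <- length(x)
  pattern <- rep_len(pattern, n)
  replacement <- rep_len(replacement, n)
  # vapply() guarantees a character vector, one element per target
  vapply(
    seq_len(n),
    function(i) sub(pattern[i], replacement[i], x[i], fixed = TRUE),
    character(1)
  )
}

# Same example data as in the thread
X <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")
res <- sub_each(patt, repl, X)
print(res)
```

Unlike the `mapply` call, this returns an unnamed vector, and unlike `sapply` it fails loudly if any call returns something other than a single string.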