[R] gsub - replace multiple occurrences with different strings

Gabor Grothendieck ggrothendieck at gmail.com
Tue Oct 6 01:49:42 CEST 2009


Here are two approaches using Bill's sample data:

1. gsubfn accepts a proto object in place of a replacement string. The
object's methods have access to a count variable that is built into
gsubfn and automatically reset to zero at the start of each string, so
you can do this (gsubfn uses proto internally, so you don't have to
load it explicitly):

> x <- c("xx y e d xx e t f xx e f xx",
+           "xx y e d xx e t f xx",
+           "xx y e d xx e t f xx e f xxxx y e d xx e t f xx e f xx")
>
> library(gsubfn)
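> p <- proto(fun = function(this, x) {
+    # count is the number of the current match within the current string
+    if (count > 4) x   # leave matches past the fourth unchanged
+    else c("x1", "x2", "x3", "x4")[count]
+ })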
> gsubfn("xx", p, x)
[1] "x1 y e d x2 e t f x3 e f x4"
[2] "x1 y e d x2 e t f x3"
[3] "x1 y e d x2 e t f x3 e f x4xx y e d xx e t f xx e f xx"

See the gsubfn vignette for more examples.
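
If you would rather not depend on gsubfn, the same per-string numbering can be sketched in base R with gregexpr() and the replacement form of regmatches(). This is only an illustration: the helper name replace_first_n is made up for this sketch and is not part of any package, and the pattern is treated as a regular expression.

```r
x <- c("xx y e d xx e t f xx e f xx",
       "xx y e d xx e t f xx",
       "xx y e d xx e t f xx e f xxxx y e d xx e t f xx e f xx")

# Hypothetical helper: replace the i-th match of `pattern` in each string
# with replacements[i]; matches beyond length(replacements) are left as is.
replace_first_n <- function(x, pattern, replacements) {
  m <- gregexpr(pattern, x)      # positions of all matches, per string
  hits <- regmatches(x, m)       # list of matched substrings, per string
  regmatches(x, m) <- lapply(hits, function(h) {
    k <- min(length(h), length(replacements))
    h[seq_len(k)] <- replacements[seq_len(k)]
    h
  })
  x
}

res <- replace_first_n(x, "xx", c("x1", "x2", "x3", "x4"))
print(res)
```

Because the matches are collected separately for each element of x, the numbering restarts with every string, which is the same behaviour the count variable gives you in gsubfn.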

2. A simpler approach is a for loop. sub replaces only the first
remaining "xx" in each string, so each pass fills in the next
replacement:

> X <- x
> for(xn in c("x1", "x2", "x3", "x4")) X <- sub("xx", xn, X)
> X
[1] "x1 y e d x2 e t f x3 e f x4"
[2] "x1 y e d x2 e t f x3"
[3] "x1 y e d x2 e t f x3 e f x4xx y e d xx e t f xx e f xx"
>
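
One thing worth checking before relying on the loop: it assumes that no replacement contains the pattern itself. If one does, the next pass matches inside the text that was just inserted instead of moving on to the next "xx". The sketch below shows both cases. The name replace_in_turn is hypothetical, and fixed = TRUE is an addition here so that the pattern is taken literally.

```r
# Hypothetical wrapper around the loop: apply sub() once per replacement.
replace_in_turn <- function(x, pattern, replacements) {
  for (r in replacements) x <- sub(pattern, r, x, fixed = TRUE)
  x
}

# Replacements that do not contain "xx": each pass takes the next match.
ok <- replace_in_turn("xx a xx b xx", "xx", c("x1", "x2"))
print(ok)    # "x1 a x2 b xx"

# Replacements that do contain "xx": the second pass matches inside "xx1".
bad <- replace_in_turn("xx a xx", "xx", c("xx1", "xx2"))
print(bad)   # "xx21 a xx"
```

With replacements such as "x1" to "x4" the problem cannot arise, so the loop is safe for the data in this thread.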


On Mon, Oct 5, 2009 at 11:19 AM, William Dunlap <wdunlap at tibco.com> wrote:
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Martin Batholdy
>> Sent: Monday, October 05, 2009 7:34 AM
>> To: r help
>> Subject: [R] gsub - replace multiple occurrences with different strings
>>
>> Hi,
>>
>> I am looking for a way to replace multiple occurrences of a string
>> with different strings, depending on where each one occurs.
>>
>>
>> I tried the following:
>>
>> x <- c("xx y e d xx e t f xx e f xx")
>> x <- gsub("xx", c("x1", "x2", "x3", "x4"), x)
>>
>>
>> What I want to get is:
>>
>> x =
>> x1 y y e d x2 e t f x3 e f x4
>
> Your output has a doubled y that is not in the input;
> I'll assume the input is correct.  I extended x to three similar
> strings:
>
>  x <- c("xx y e d xx e t f xx e f xx",
>           "xx y e d xx e t f xx",
>           "xx y e d xx e t f xx e f xxxx y e d xx e t f xx e f xx")
>
> If you know you always have 4 xx's you can use sub (or gsub),
> but it doesn't work properly if there are not exactly 4 xx's:
>  > sub("xx(.*)xx(.*)xx(.*)xx", "x1\\1x2\\2x3\\3x4", x)
>  [1] "x1 y e d x2 e t f x3 e f x4"
>  [2] "xx y e d xx e t f xx"
>  [3] "x1 y e d xx e t f xx e f xxxx y e d x2 e t f x3 e f x4"
>
> You can use gsubfn() from package gsubfn along with a function that
> maintains a count of how many times it has been called, as in
>  > gsubfn("xx", local({n<-0;function(x){n<<-n+1;paste(x,n,sep="")}}), x)
>  [1] "xx1 y e d xx2 e t f xx3 e f xx4"
>  [2] "xx5 y e d xx6 e t f xx7"
>  [3] "xx8 y e d xx9 e t f xx10 e f xx11xx12 y e d xx13 e t f xx14 e f xx15"
>
> If you want the count to start anew with each string in the vector you
> can use sapply.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>
>
>>
>>
>> But what I get is:
>>
>> x =
>> x1 y y e d x1 e t f x1 e f x1
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
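
Bill's closing suggestion in the quoted message, restarting his counter for each string with sapply, might look like the following. It is a sketch that assumes the gsubfn package is installed, and the wrapper name number_matches is made up here. The point is that local() is evaluated inside the function that sapply calls, so every string gets a fresh closure with n starting at 0.

```r
library(gsubfn)

x <- c("xx y e d xx e t f xx e f xx",
       "xx y e d xx e t f xx",
       "xx y e d xx e t f xx e f xxxx y e d xx e t f xx e f xx")

# Hypothetical wrapper: build a new counting closure for one string,
# then let gsubfn call it once per match in that string.
number_matches <- function(s) {
  counter <- local({
    n <- 0
    function(m) {
      n <<- n + 1
      paste(m, n, sep = "")
    }
  })
  gsubfn("xx", counter, s)
}

res <- sapply(x, number_matches, USE.NAMES = FALSE)
print(res)
```

Compared with the single shared closure in Bill's example, the numbering now restarts at xx1 for each element of x instead of running on from the previous string.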



