[R] Replace multiple subexpressions at once?

Bill Dunlap
Tue Oct 25 19:58:58 CEST 2022


There probably is a package with such a function, but you can do it with one
call to sub() if you parenthesize all the subexpressions in the regular
expression and use \\1, etc., in the replacement for those parts you want
to keep.  E.g.,

> s <- "<div data-pos=\"Untitled.knit.md@12:1-13:1\">"
> want <-  "<div data-pos=\"newname@newnumber:1-13:1\">"
> new_regexp <- "(.*<[^>]+ data-pos=\")([^\"]*)(@)([[:digit:]]+)(:.*)"
> all.equal( sub(new_regexp, "\\1newname\\3newnumber\\5", s), want)
[1] TRUE
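
If the number of groups varies, the regexec() route mentioned in the
original question may be more convenient than rewriting the pattern by
hand. What follows is only a minimal sketch under stated assumptions:
replace_groups() is a hypothetical helper (not from any package that I
know of), it handles a single string, it only touches the first match,
and it assumes the capture groups are not nested and that every group
takes part in the match.

```r
# Hypothetical helper: replace each capture group of the first match of
# `pattern` in the single string `x` with the corresponding element of
# `replacements`. Assumes non-nested groups that all participate.
replace_groups <- function(pattern, replacements, x) {
  stopifnot(is.character(x), length(x) == 1L)
  m <- regexec(pattern, x)[[1]]
  if (m[1] == -1L) return(x)              # no match: return input unchanged
  starts <- as.integer(m)[-1]             # drop the whole-match entry
  lens   <- attr(m, "match.length")[-1]
  stopifnot(length(replacements) == length(starts))
  # Splice from right to left so earlier positions stay valid.
  for (i in order(starts, decreasing = TRUE)) {
    x <- paste0(substr(x, 1L, starts[i] - 1L),
                replacements[i],
                substr(x, starts[i] + lens[i], nchar(x)))
  }
  x
}

s <- "<div data-pos=\"Untitled.knit.md@12:1-13:1\">"
regexp <- ".*<[^>]+ data-pos=\"([^\"]*)@([[:digit:]]+):.*"
replace_groups(regexp, c("newname", "newnumber"), s)
```

This keeps the original pattern (with only the parts to be replaced in
parentheses) at the cost of a small loop. The sub() version above is
shorter when the pattern is fixed and easy to re-parenthesize.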

-Bill


On Tue, Oct 25, 2022 at 10:12 AM Duncan Murdoch <murdoch.duncan using gmail.com>
wrote:

> An R regular expression pattern can contain parenthesized
> subexpressions, and those can be used later in the same pattern, or in
> the replacement in sub() and gsub().  What I'd like to do is match
> several subexpressions and replace all of them.
>
> For example, I might have a string like
>
>      s <- "<div data-pos=\"Untitled.knit.md@12:1-13:1\">"
>
> and I can use the regular expression
>
>      regexp <- ".*<[^>]+ data-pos=\"([^\"]*)@([[:digit:]]+):.*"
>
> to match the whole thing, with \\1 matching "Untitled.knit.md" and \\2
> matching "12".  I'd like to replace the first with "newname" and the
> second with "newnumber", so the string ends up looking like
>
>      s <- "<div data-pos=\"newname@newnumber:1-13:1\">"
>
> I could write a function to do this using regexec() and two calls to
> `substring<-`(), but I'm hoping someone has already done that.  Any
> suggestions?
>
> Duncan Murdoch
>
> P.S.  I'm going to be travelling for a couple of weeks, and so may not
> respond right away.
>
