[R] Regexp pattern but fixed replacement?

Dave Dixon dd|xon @end|ng |rom @wcp@com
Thu Apr 11 18:57:07 CEST 2024


Backslashes in regular expressions in R are maddening, but they make sense.

R string handling interprets your replacement string "\\" as just one 
backslash. Your string is received by gsub as "\" - that is, just the 
control backslash, NOT the character backslash. gsub is expecting to see 
\0, \1, \2, or some other control starting with backslash.
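
A quick way to see this for yourself (just a sketch; the variable names 
are mine, not from the original post):

```r
# The literal "\\" in R source code is a string holding ONE character.
repl <- "\\"
n_chars <- nchar(repl)   # counts characters actually stored in the string

# gsub therefore receives a lone backslash as the replacement, with
# nothing after it to escape, so each match is replaced by nothing.
res <- gsub("a|b", repl, "abcdef")
writeLines(res)          # prints the raw characters of the result
```

On my reading of the original example, the result is just the input 
with "a" and "b" removed.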

If you want gsub to replace with a backslash character, gsub has to 
receive two backslashes (an escaped backslash). To get two backslash 
characters into an R string literal, you have to double them ALL: "\\\\".
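
Here is a small sketch of the doubling at work, using the same example 
as in the quoted message (variable names are only for illustration):

```r
# Four backslashes in source code give a string of TWO backslash characters.
repl <- "\\\\"
n_repl <- nchar(repl)

# gsub reads those two characters as "one literal backslash", so each of
# the two matches ("a" and "b") is replaced by a single backslash.
res <- gsub("a|b", repl, "abcdef")
n_res <- nchar(res)      # two backslashes plus "cdef"
```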

The result is printed as an R string literal: each backslash is escaped 
with another backslash, so "\\\\" in the output really means two 
backslash characters.
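
If the printed form is confusing, comparing print() with writeLines() 
may help. This is only a sketch to illustrate the difference:

```r
res <- gsub("a|b", "\\\\", "abcdef")

# print() shows the string as an R literal, with every backslash escaped.
printed <- capture.output(print(res))

# writeLines() shows the characters that are actually in the string.
raw <- capture.output(writeLines(res))

cat(printed, "\n")   # escaped form, four backslashes on screen
cat(raw, "\n")       # raw form, two backslashes on screen
```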

There are lots of special characters in the search string, but only one 
in the replacement string: backslash.
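
Since backslash is the only one to worry about, doubling every backslash 
in the replacement should be enough to make it behave as fixed text. 
Below is a minimal sketch that wraps the doubling line from your message; 
escape_replacement() and gsub_literal() are hypothetical helper names I 
made up, not functions from base R or stringr:

```r
# Hypothetical helper: double every backslash so gsub treats the
# replacement as literal text (same expression as in the quoted message).
escape_replacement <- function(replacement) {
  gsub("\\\\", "\\\\\\\\", replacement)
}

# Hypothetical wrapper: regular-expression pattern, fixed replacement.
gsub_literal <- function(pattern, replacement, x, ...) {
  gsub(pattern, escape_replacement(replacement), x, ...)
}

# The replacement here is a single backslash character.
res1 <- gsub_literal("a|b", "\\", "abcdef")

# The replacement here is the two characters backslash and "1"; it is
# inserted literally instead of acting as a backreference.
res2 <- gsub_literal("(a)", "\\1", "abc")

writeLines(c(res1, res2))
```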

My favorite resource on this topic is 
https://www.regular-expressions.info/replacecharacters.html


On 4/11/24 10:35, Duncan Murdoch wrote:
> I noticed this issue in stringr::str_replace, but it also affects 
> sub() in base R.
>
> If the pattern in a call to one of these needs to be a regular 
> expression, then backslashes in the replacement text are treated 
> specially.
>
> For example,
>
>   gsub("a|b", "\\", "abcdef")
>
> gives "cdef", not "\\\\cdef" as I wanted.  To get the latter, I need to 
> escape the replacement backslashes, e.g.
>
>   gsub("a|b", "\\\\", "abcdef")
>
> which gives "\\\\cdef".
>
> I have two questions:
>
> 1.  Is there a variant on sub or str_replace which allows the pattern 
> to be declared as a regular expression, but the replacement to be 
> declared as fixed?
>
> 2.  To get what I want, I can double the backslashes in the 
> replacement text.  This would do that:
>
>    replacement <- gsub("\\\\", "\\\\\\\\", replacement)
>
> Are there any other special characters to worry about besides 
> backslashes?
>
> Duncan Murdoch