[R] gsub: replacing double backslashes with single backslash

David Winsemius dwinsemius at comcast.net
Wed Mar 7 15:57:02 CET 2012

On Mar 7, 2012, at 6:54 AM, Markus Elze wrote:

> Hello everybody,
> this might be a trivial question, but I have been unable to find  
> this using Google. I am trying to replace double backslashes with  
> single backslashes using gsub.

Actually you don't have double backslashes in the argument you are  
presenting to gsub. The string entered at the console as "C:\\" only  
has a single backslash.

 > nchar("C:\\")
[1] 3
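
To make the point above concrete, here is a small sketch of the difference between what you type and what the string holds. The variable name x is only for illustration.

```r
# "C:\\" as typed contains an escaped backslash; the stored string has one.
x <- "C:\\"

nchar(x)        # 3 characters: 'C', ':' and a single backslash
print(x)        # print() re-escapes the backslash for display
writeLines(x)   # writeLines() shows the raw characters

# Two backslashes in the stored string need four in the source text.
nchar("\\\\")   # 2
```

So print() and the console echo show the escaped form, while writeLines() or cat() show what is really stored.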

> There seems to be some unexpected behaviour with regards to the  
> replacement string "\\". The following example uses the string C:\\  
> which should be converted to C:\ .
> > gsub("\\\\", "\\", "C:\\")
> [1] "C:"

But I do not understand that returned value, either. I thought that  
the 'replacement' argument (which I think I have demonstrated is a single  
backslash) would get put back in the returned value.

> > gsub("\\\\", "Test", "C:\\")
> [1] "C:Test"
> > gsub("\\\\", "\\\\", "C:\\")
> [1] "C:\\"

I thought the parsing rules for 'replacement' were different from the  
rules for 'pattern'. So I'm puzzled, too. Maybe something changed in 2.14?

 > sub("\\\\", "\\", "C:\\", fixed=TRUE)
[1] "C:\\"
 > sub("\\\\", "\\", "C:\\")
[1] "C:"
 > sub("([\\])", "\\1", "C:\\")
[1] "C:\\"
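
For comparison, here is a rough sketch of calls that do put a literal backslash into the result, assuming the backslash in 'replacement' has to be escaped once more when fixed = FALSE. The variable names (one, two, r1, ...) are made up for illustration, and the result noted for r2 is simply the one reported in this thread.

```r
one <- "C:\\"         # stored as C:\    (one backslash)
two <- "C:\\\\dir"    # stored as C:\\dir (two backslashes)

# Pattern "\\\\" is the regex \\ and matches ONE literal backslash.
# Replacement "\\\\" is an escaped backslash and yields ONE backslash.
r1 <- gsub("\\\\", "\\\\", one)       # unchanged: C:\

# A lone backslash as replacement; reported in this thread to give "C:".
r2 <- gsub("\\\\", "\\", one)

# Collapsing a real double backslash: the pattern must match two of them.
r3 <- gsub("\\\\\\\\", "\\\\", two)   # C:\dir

# With fixed = TRUE the pattern is literal, so "\\" matches one backslash.
r4 <- gsub("\\", "/", one, fixed = TRUE)   # C:/

writeLines(c(r1, r3, r4))
```

This would also explain why the fixed = TRUE call above returns its input unchanged: there the pattern is two literal backslashes, and the input only holds one.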

The NEWS file does say that there is a new regular expression  
implementation and that the help file for regex should be consulted.

And presumably we should study this:


In the 'replacement' argument, a "\\" followed by a digit (e.g. "\\1") is  
used to back-reference a numbered sub-pattern, so perhaps a bare "\\" is now  
getting handled as the "null subpattern"? I don't see that mentioned in the  
regex help page, but it is a big "page". I also didn't see "\\" referenced  
in the TRE documentation, but then again I don't think that "\\" in console  
or source() input is a double backslash. The TRE documentation says that  
"A \ cannot be the last character of an ERE." I cannot tell whether that  
rule gets applied to the 'replacement'.
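
As a small illustration of the back-reference use of the backslash in 'replacement', here is a sketch with a made-up input string ("user@host"). The names swapped and kept are only for illustration.

```r
# "\\1" and "\\2" in the replacement refer to the first and second
# parenthesised sub-patterns of the match.
swapped <- sub("(\\w+)@(\\w+)", "\\2 at \\1", "user@host")
writeLines(swapped)   # host at user

# The call from earlier in this message: the backslash is captured as
# group 1 and put back via "\\1", so the input comes back unchanged.
kept <- sub("([\\])", "\\1", "C:\\")
writeLines(kept)      # C:\
```

In both calls the backslash in 'replacement' is followed by a digit, which is the case the help page does describe.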

> I have observed similar behaviour for fixed=TRUE and perl=TRUE. I  
> use R 2.14.1 64-bit on Windows 7.

David Winsemius, MD
West Hartford, CT
