[R] gsub: replacing double backslashes with single backslash

Ista Zahn istazahn at gmail.com
Thu Mar 8 03:55:03 CET 2012


On Wed, Mar 7, 2012 at 12:57 PM, Greg Snow <538280 at gmail.com> wrote:
>
> The issue here is the difference between what is contained in a string
> and what R displays to you.
>
> The string produced with the code:
>
> > tmp <- "C:\\"
>
> only has 3 characters (as David pointed out), the third of which is a
> single backslash, since the 1st \ escapes the 2nd and the R string
> parsing rules use the combination to put a single backslash in the
> string.  When you print the string (whether you call print directly or
> indirectly) the print function escapes special characters, including
> the backslash, so you see "\\" which represents a single backslash in
> the string.  If you use the cat function instead of the print
> function, then you will only see a single backslash (and other escape
> sequences such as \n will also display differently in print vs. cat
> output).  There are other ways to see the exact string (write to a
> file, use in certain commands, etc.) but cat is probably the simplest.
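
A minimal sketch of the print vs. cat distinction described above, assuming a plain R session. The variable name tmp is taken from the quoted example; the comments state what the standard print method and cat should display.

```r
# The literal is typed with two backslashes, but the parser stores only one.
tmp <- "C:\\"

nchar(tmp)              # 3 characters: 'C', ':' and a single backslash

print(tmp)              # shows [1] "C:\\"  (print re-escapes the backslash)
cat(tmp, "\n", sep = "")  # shows C:\       (the characters actually stored)
```

Comparing nchar() with the printed form is a quick way to check how many backslashes a string really contains.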


Fine, but how does this help the OP (and me!) figure out how to
replace "C:\\" with "C:\" ?

Best,
Ista
>
>
> On Wed, Mar 7, 2012 at 7:57 AM, David Winsemius <dwinsemius at comcast.net> wrote:
> >
> > On Mar 7, 2012, at 6:54 AM, Markus Elze wrote:
> >
> >> Hello everybody,
> >> this might be a trivial question, but I have been unable to find this
> >> using Google. I am trying to replace double backslashes with single
> >> backslashes using gsub.
> >
> >
> > Actually you don't have double backslashes in the argument you are
> > presenting to gsub. The string entered at the console as "C:\\" only has a
> > single backslash.
> >
> >> nchar("C:\\")
> > [1] 3
> >
> >
> >> There seems to be some unexpected behaviour with regard to the
> >> replacement string "\\". The following example uses the string C:\\ which
> >> should be converted to C:\ .
> >>
> >> > gsub("\\\\", "\\", "C:\\")
> >> [1] "C:"
> >
> >
> > But I do not understand that returned value, either. I thought that the
> > 'repl' argument (which I think I have demonstrated is a single backslash)
> > would get put back in the returned value.
> >
> >
> >
> >> > gsub("\\\\", "Test", "C:\\")
> >> [1] "C:Test"
> >> > gsub("\\\\", "\\\\", "C:\\")
> >> [1] "C:\\"
> >
> >
> > I thought the parsing rules for 'replacement' were different than the rules
> > for 'patt'. So I'm puzzled, too. Maybe something changed in 2.14?
> >
> >> sub("\\\\", "\\", "C:\\", fixed=TRUE)
> > [1] "C:\\"
> >
> >> sub("\\\\", "\\", "C:\\")
> > [1] "C:"
> >> sub("([\\])", "\\1", "C:\\")
> > [1] "C:\\"
> >
> > The NEWS file does say that there is a new regular expression implementation
> > and that the help file for regex should be consulted.
> >
> > And presumably we should study this:
> >
> > http://laurikari.net/tre/documentation/regex-syntax/
> >
> >  In the 'replacement' argument, the "\\" is used to back-reference a
> > numbered sub-pattern, so perhaps "\\" is now getting handled as the "null
> > subpattern"? I don't see that mentioned in the regex help page, but it is a
> > big "page". I also didn't see "\\" referenced in the TRE documentation, but
> > then again I don't think that "\\" in console or source() input is a double
> > backslash. The TRE document says that "A \ cannot be the last character of
> > an ERE." I cannot tell whether that rule gets applied to the 'replacement'.
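
A hedged sketch of one way to read the results above: if the backslash acts as an escape character in the 'replacement' of a regex-based gsub, then a literal backslash has to be doubled there, just as in the pattern. The names x, y and z are illustrative only. The input here really does contain two backslashes (four typed in the literal), unlike "C:\\". The fixed = TRUE variant assumes that both pattern and replacement are taken literally in that mode.

```r
# Four backslashes typed, two stored: this string really has a double backslash.
x <- "C:\\\\"
nchar(x)                            # 4

# Regex mode: the pattern matches two literal backslashes, and the
# replacement "\\\\" (two stored backslashes) yields one literal backslash.
y <- gsub("\\\\\\\\", "\\\\", x)
nchar(y)                            # 3
cat(y, "\n", sep = "")              # C:\

# Fixed mode (assumed literal on both sides): two backslashes -> one.
z <- gsub("\\\\", "\\", x, fixed = TRUE)
identical(y, z)                     # TRUE
```

Under this reading, a replacement of "\\" in regex mode is a lone escape character with nothing after it, which would be consistent with the backslash disappearing in the "C:" result.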
> >
> >
> >>
> >>
> >> I have observed similar behaviour for fixed=TRUE and perl=TRUE. I use R
> >> 2.14.1 64-bit on Windows 7.
> >
> >
> >
> > --
> > David Winsemius, MD
> > West Hartford, CT
> >
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538280 at gmail.com
>


