[R] gsub: replacing double backslashes with single backslash

Ista Zahn istazahn at gmail.com
Thu Mar 8 15:13:09 CET 2012


Thanks all, I realized the error of my ways last night. Apparently my
brain was on vacation yesterday. Thanks to all for going through this
yet again!

Best,
Ista

On Thu, Mar 8, 2012 at 1:08 AM, Daniel Nordlund <djnordlund at frontier.com> wrote:
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of Ista Zahn
>> Sent: Wednesday, March 07, 2012 6:55 PM
>> To: Greg Snow
>> Cc: r-help at r-project.org; Markus Elze
>> Subject: Re: [R] gsub: replacing double backslashes with single backslash
>>
>> On Wed, Mar 7, 2012 at 12:57 PM, Greg Snow <538280 at gmail.com> wrote:
>> >
>> > The issue here is the difference between what is contained in a string
>> > and what R displays to you.
>> >
>> > The string produced with the code:
>> >
>> > > tmp <- "C:\\"
>> >
>> > only has 3 characters (as David pointed out), the third of which is a
>> > single backslash, since the 1st \ escapes the 2nd and the R string
>> > parsing rules use the combination to put a single backslash in the
>> > string.  When you print the string (whether you call print directly or
>> > indirectly) the print function escapes special characters, including
>> > the backslash, so you see "\\" which represents a single backslash in
>> > the string.  If you use the cat function instead of the print
>> > function, then you will only see a single backslash (and other escape
>> > sequences such as \n will also display differently in print vs. cat
>> > output).  There are other ways to see the exact string (write to a
>> > file, use in certain commands, etc.) but cat is probably the simplest.
>>
>>
>> Fine, but how does this help the OP (and me!) figure out how to
>> replace "C:\\" with "C:\" ?
>>
>> Best,
>> Ista
>
> Ista,
>
> you have received some good descriptions / explanations of what is going on, but you don't seem to have digested it yet.  I don't blame you, I found this difficult myself when I first encountered this.  One needs to keep distinct what is actually contained in a string, and how R chooses to display it under various circumstances.  Consider the example again
>
>> tmp <- "C:\\"
>
> the variable tmp contains only three characters: 1. a capital C, 2. a colon, and 3. a single backslash.  You can tell it only has three characters like this
>
>> nchar(tmp)
> [1] 3
>
> If you use cat() to display the contents you will also see that it only has three characters (I included the newline character to force a newline; print() does it automatically, but cat() doesn't)
>
>> cat(tmp, '\n')
> C:\
>
> So again we see just three characters.  However, if we display the variable with print, we will see two backslashes even though there is actually only one backslash in the variable.
>
>> print(tmp)
> [1] "C:\\"
>
> So when you ask, 'Fine, but how does this help the OP (and me!) figure out how to replace "C:\\" with "C:\"?', you need to be clear about whether you are talking about a string which displays with two backslashes, or a string that actually has two consecutive backslashes, which print() will display as four consecutive backslashes.  If you are talking about a variable, tmp, that actually has two backslashes in it, then it will display like this
>
>> tmp
> [1] "C:\\\\"
>> print(tmp)
> [1] "C:\\\\"
>> cat(tmp,'\n')
> C:\\
>
> If you then want to change it so that it has only 1 backslash in it, you could do
>
>> tmp <- sub('\\\\', '\\', tmp)
>> tmp
> [1] "C:\\"
>> print(tmp)
> [1] "C:\\"
>> cat(tmp,'\n')
> C:\
>
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA USA
>
>
>
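For anyone reading this thread later, here is a minimal sketch of the replacement discussed above, for a string that really does contain two consecutive backslashes. It uses base R only. The names n_before, fixed_result and regex_result are illustrative and do not come from the thread. The sketch shows two routes: fixed = TRUE, which treats both pattern and replacement as literal text, and the regex form, where each literal backslash has to be escaped once for the R parser and once for the regex engine.

```r
# A string that actually holds two backslashes after the colon.
# The source literal needs four, because the R parser halves them.
tmp <- "C:\\\\"
n_before <- nchar(tmp)   # counts C, colon, backslash, backslash

# Literal matching: the pattern is two backslashes, the replacement is one.
fixed_result <- gsub("\\\\", "\\", tmp, fixed = TRUE)

# Regex matching: eight backslashes in the source for a two-backslash match,
# four in the source for a one-backslash replacement.
regex_result <- gsub("\\\\\\\\", "\\\\", tmp)

# cat() shows the real contents; print() re-escapes each backslash.
cat(tmp, "\n")
cat(fixed_result, "\n")
print(fixed_result)

# Both routes should agree and leave a three-character string.
stopifnot(identical(fixed_result, regex_result))
stopifnot(nchar(fixed_result) == 3)
```

Checking with nchar() and cat() rather than print() is the quickest way to confirm how many backslashes a string really contains before and after the substitution.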


