[R] regex problems with the escape character

Gabor Grothendieck ggrothendieck at gmail.com
Tue Aug 18 10:45:39 CEST 2009


On Mon, Aug 17, 2009 at 6:12 AM, Gabor
Grothendieck<ggrothendieck at gmail.com> wrote:
> On Mon, Aug 17, 2009 at 3:21 AM, ravi<rv15i at yahoo.se> wrote:
>> Hi R-users and R-experts,
>> I am having a hard time in figuring out how to tackle regex questions where the "backslash" character is an integral part of the string. Let me explain how I came across this problem :
>> I wanted to clearly see all the components in the windows environmental path variable. This is a long string. For easy readability, I wanted to split up this string so that each component starts on a new line. But I ran smack into a problem in the very first step - in reading in the string. Look at the following code :
>>
>> ##### path variable has been shortened to show only the first few components
>> s1<- "C:\Rtools\bin;C:\Rtools\perl\bin;C:\Rtools\MinGW\bin;C:\Program\MiKTeX 2.7\miktex\bin;"
>> s2<-gsub(";",";\n",s1)
>> cat(s2,"\n")
>>
>> I get the following warning messages after the first line :
>> Warning messages:
>> 1: '\R' is an unrecognized escape in a character string
>> 2: '\R' is an unrecognized escape in a character string
>> 3: '\p' is an unrecognized escape in a character string
>> 4: '\R' is an unrecognized escape in a character string
>> 5: '\M' is an unrecognized escape in a character string
>> 6: '\P' is an unrecognized escape in a character string
>> 7: '\M' is an unrecognized escape in a character string
>> 8: '\m' is an unrecognized escape in a character string
>> 9: unrecognized escapes removed from "C:\Rtools\bin;C:\Rtools\perl\bin;C:\Rtools\MinGW\bin;C:\Program\MiKTeX 2.7\miktex\bin;"
>>
>> I thought about attempting to escape the escape character and so on. Is that a workable option here? However, for this to work, I must first be able to read in the string correctly. The simplest solution, if it is at all possible, would be to temporarily change the escape character. To "%", for example. Is such a declaration possible? Other alternatives?
>> I would appreciate help in understanding different ways of solving this problem.
>> Thanking you,
>> Ravi
>>
>
> s1 should use \\ like this:
>
> s1<- "C:\\Rtools\\bin;C:\\Rtools\\perl\\bin;C:\\Rtools\\MinGW\\bin;C:\\Program\\MiKTeX 2.7\\miktex\\bin"
>
> Also 1. you won't need to change your path in the first place if you use
> the batchfiles at http://batchfiles.googlecode.com
> 2. there is also a free path editor discussed there. See Q4 or Q8
> in the troubleshooting faq on that same page.
>
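To make the quoted advice concrete, here is a small sketch. The names s1
and s2 and the shortened path are taken from the original question; the
use of fixed = TRUE is an addition of this sketch. The doubled
backslashes exist only in the source code: the string itself holds single
backslashes, which is what cat() prints and what nchar() counts.

```r
# Each \\ in the source is ONE backslash in the resulting string.
s1 <- "C:\\Rtools\\bin;C:\\Rtools\\perl\\bin;C:\\Rtools\\MinGW\\bin;C:\\Program\\MiKTeX 2.7\\miktex\\bin;"

# fixed = TRUE treats ";" as a literal string rather than a regex.
s2 <- gsub(";", ";\n", s1, fixed = TRUE)

# cat() shows the actual characters; print() would re-escape them.
cat(s2)

# The backslash is counted once, not twice.
n <- nchar("C:\\Rtools")
```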

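A few more comments.  R is not like Perl here.

Within a quoted string literal, \ always escapes the next character,
and you can't specify an alternative escape character.  To avoid
doubling the backslashes, the string has to be read in as data
rather than written as R source.  You can do this: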

> x <- scan(what = "")
1: C:\Rtools\bin;C:\Rtools\perl\bin;
2:
Read 1 item

but that only works if you enter it directly at the R console,
not from within a sourced file.  This also works:

x <- readLines("myfile.txt")

if myfile.txt is the name of a file containing the indicated string.
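A self-contained sketch of the readLines() route follows. It writes the
path to a temporary file first so that the example runs anywhere; the
tempfile() stand-in for myfile.txt and the shortened path are assumptions
of this sketch. In practice the file would be created outside R, with the
path pasted in verbatim.

```r
# Stand-in for myfile.txt: a temporary file holding the raw path.
# It is written from R here only to keep the sketch runnable; the \\
# below puts single backslashes into the file.
f <- tempfile(fileext = ".txt")
writeLines("C:\\Rtools\\bin;C:\\Rtools\\perl\\bin;", f)

# readLines() does no escape processing: backslashes arrive as-is.
x <- readLines(f)

# Print one path component per line.
cat(gsub(";", ";\n", x, fixed = TRUE))

# Clean up the temporary file.
unlink(f)
```

For the specific case of the Windows path, Sys.getenv("PATH") should
return the value as a string directly, with no quoting involved.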

I agree this is limiting and has been the source of numerous
enhancement requests.
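
For readers on a newer R: as far as I know, R 4.0.0 and later accept raw
string constants of the form r"(...)", in which a backslash is not
treated as an escape. The following is only a sketch under that
assumption and will not parse on older versions:

```r
# Requires R >= 4.0.0 (raw string syntax); older versions fail to parse this.
s1 <- r"(C:\Rtools\bin;C:\Rtools\perl\bin;)"

# Compare with the doubled-backslash spelling of the same path.
same <- identical(s1, "C:\\Rtools\\bin;C:\\Rtools\\perl\\bin;")

# Print one path component per line.
cat(gsub(";", ";\n", s1, fixed = TRUE))
```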

The gsubfn package does have one Perl-like feature.
Just preface any function call with fn$ and it will perform
string interpolation on certain arguments in a manner
analogous to Perl.  See http://gsubfn.googlecode.com

> library(gsubfn)
> fn$cat("This is pi= $pi and this is pi+1 = `pi+1`\n")
This is pi= 3.14159265358979 and this is pi+1 = 4.14159265358979



