[R] grep and gsub on backslash and quotes
Peter Dalgaard BSA
p.dalgaard at biostat.ku.dk
Tue Aug 12 18:21:40 CEST 2003
"Simon Fear" <Simon.Fear at synequanon.com> writes:
> The following code works, to gsub single quotes to double quotes:
> line <- gsub("'", '"', line)
> (that's a single quote within doubles then a double within singles if
> viewer's font is not good).
> But The R Language Manual tells me that
> Quotes and other special characters within strings
> are specified using escape sequences:
> \' single quote
> \" double quote
> so why is the following wrong: gsub("\\\\'", "\\\\"", line)? That or any
> other number of backslashes (have tried all up to n=6 just for good
> measure).
There's a backslash missing in the replacement. This works:
line <- "ab\\\'cd"
gsub("\\\\'", "\\\\\"", line)
and will replace \' with \"
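
A minimal sketch for checking this yourself: writeLines() shows the characters actually stored in a string, whereas print() shows them re-escaped. The name "result" is only introduced here for illustration.

```r
# The literal "ab\\'cd" stores six characters: a b \ ' c d
line <- "ab\\'cd"

# Pattern literal stores \\'  : the regex matches a backslash followed by '
# Replacement literal stores \\" : the regex replacement yields a backslash followed by "
result <- gsub("\\\\'", "\\\\\"", line)

writeLines(line)    # ab\'cd
writeLines(result)  # ab\"cd
```

Comparing print(result) with writeLines(result) makes it easy to see which backslashes are real and which are only display escapes.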
> BTW is it documented anywhere that you need four backslashes in an RE to
> match one in the target, when it is being passed as an argument to gsub
> or grep? How would I know how many levels of doubling up to use for any
> functions? (I got to 4 consecutive \ by trial and error in this case,
> have a dim memory of having read about it somewhere.)
There are two levels because backslashes are escape characters both to
R strings and regular expressions. So in the above, "line" is
ab\'cd
and the match pattern is
\\' which matches \'
and the replacement is
\\" which becomes \"
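
The two levels can be inspected separately. The sketch below counts the characters left after R has parsed each literal, and then uses fixed = TRUE as one way of skipping the regular-expression level altogether, so that a single doubling suffices. The variable names are illustrative, not taken from the original post.

```r
line <- "ab\\'cd"          # six characters once R has parsed the literal

pattern     <- "\\\\'"     # four typed backslashes, two stored
replacement <- "\\\\\""    # two stored backslashes, then a double quote

nchar(line)         # 6
nchar(pattern)      # 3
nchar(replacement)  # 3

# Level 1 (the R parser) is done; this is what the regex engine receives
writeLines(pattern)      # \\'
writeLines(replacement)  # \\"

# With fixed = TRUE there is no regex level, so only R's own escaping applies
via_regex <- gsub(pattern, replacement, line)
via_fixed <- gsub("\\'", "\\\"", line, fixed = TRUE)
identical(via_regex, via_fixed)
```

So the rule of thumb is one doubling for the R string literal, plus one more wherever the string is then interpreted as a regular expression.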
More interesting is
> gsub("\\'", "a", line)
> gsub("\\'", "a", line, perl=T)
so \' matches a single quote with PCRE but not with ordinary RE. (Yes,
there's a reason...)
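
A hedged illustration of the PCRE case only: PCRE treats a backslash before a non-alphanumeric character as that literal character, so with perl = TRUE the pattern \' matches a plain single quote. What the default engine does with the same pattern may depend on the R version and its regex library, so the sketch prints that result rather than assuming a value. The names pcre_result and default_result are illustrative.

```r
line <- "ab\\'cd"   # characters: a b \ ' c d

# PCRE: \' is simply an escaped single quote, so only the quote is replaced
pcre_result <- gsub("\\'", "a", line, perl = TRUE)
writeLines(pcre_result)   # ab\acd

# Default engine: inspect the result rather than assume it
default_result <- gsub("\\'", "a", line)
writeLines(default_result)
```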
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907