[R] gsub regexp question

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Wed Oct 15 15:27:33 CEST 2014


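I believe the extended RE library used by R (perl=FALSE) does not treat the backslash as an escape character inside a bracket expression, so it is taken as a literal there. The first ] therefore closes the character class, and the last ] is outside the class and is the atom that the * applies to. The pattern then matches a single character from the class followed by any number of ] characters, which is why "array[n] <- 10" gives only "a".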

 gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "a]]]rray[n] <- 10", perl=FALSE)

yields

"a]]]"
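
A possible workaround, as a sketch rather than the only fix: in a POSIX bracket expression a literal ] is placed first in the list and [ needs no escape, so the class can be written without any backslashes. The variable names below (x, pat, res_tre, res_pcre) are mine, for illustration only.

```r
# Sketch: write the class without backslashes so it does not depend on
# how the regex engine treats "\" inside a bracket expression.
x   <- "array[n] <- 10"
pat <- "^([][[:alnum:]]*).*"   # "]" first in the list, then "[", then [:alnum:]

res_tre  <- gsub(pat, "\\1", x, perl = FALSE)  # default (extended RE) engine
res_pcre <- gsub(pat, "\\1", x, perl = TRUE)   # PCRE engine

print(res_tre)
print(res_pcre)
```

Both calls should return "array[n]", since the class now contains ], [ and the alphanumerics under either engine.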

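(Using F in place of FALSE, or T in place of TRUE, is bad form: T and F are ordinary variables that can be reassigned, whereas TRUE and FALSE are reserved words.)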
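
A small illustration of why the long forms are safer, assuming nothing beyond base R. The helper name shadowed is hypothetical, and local() is used only to keep the reassignment out of the global workspace.

```r
# T and F are plain variables bound to TRUE and FALSE in base R,
# so any code can rebind them; TRUE and FALSE are reserved words.
shadowed <- local({
  T <- 0               # legal assignment, silently changes the meaning of T
  identical(T, TRUE)   # no longer TRUE
})
print(shadowed)
```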
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

On October 15, 2014 4:42:00 AM PDT, ALBERTO VIEIRA FERREIRA MONTEIRO <albmont at centroin.com.br> wrote:
>I just found a curious behaviour of regexp and I'd like to share with
>y'all.
>
>gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "array[n] <- 10", perl=T) # works as expected ("array[n]")
>
>gsub("^([[:alnum:]\\[\\]]*).*", "\\1", "array[n] <- 10", perl=F) # doesn't work ("a")
>
>I didn't find anything in the documentation explaining what's going on,
>and why the second gsub doesn't work.
>
>Alberto Monteiro
>


