[R] regular expression change in R version 2.3.0?

Thomas Girke thomas.girke at ucr.edu
Sat May 6 19:36:20 CEST 2006


The interpretation of regular expressions with repetition
quantifiers in the 'gregexpr' function seems to have changed
between R versions 2.2.0 and 2.3.0. The 'gsub' function, however,
gives the same results in both versions. Below is an example that
demonstrates the difference in 'gregexpr'. I am not sure whether
this new behavior is an intended change or a bug. Personally, I
found the old behavior more useful, since it is consistent with
Perl regular expressions.

Here are my questions:
(1) Is there a way to obtain from 'gregexpr' the old output
of R version 2.2.0 when using regular expressions with
repetition quantifiers?

(2) How can one stay informed about changes to regular expressions
and the associated functions in new versions of R?


Here is the example code to demonstrate the version difference 
of the 'gregexpr' function between versions 2.3.0 and 2.2.0:

# Example string
x <- "xaaaaxaaaax"

# gregexpr in Version 2.2.0  (2005-10-06 r35749)
gregexpr("[a]{1,}", as.character(x), perl=T)
[[1]]
[1] 2 7
attr(,"match.length")
[1] 4 4

# gregexpr in Version 2.3.0 (2006-04-24)
gregexpr("[a]{1,}", as.character(x), perl=T)
[[1]]
[1]  2  3  4  5  7  8  9 10
attr(,"match.length")
[1] 4 3 2 1 4 3 2 1

# gsub gives expected output in Versions 2.2.0 & 2.3.0
gsub("[a]{1,}", "_", as.character(x), perl=T)
[1] "x_x_x"
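
As a possible stopgap for question (1), the overlapping matches
reported by 2.3.0 could be collapsed after the fact. The following is
only a sketch: 'collapse_overlaps' is a hypothetical helper, not an R
function, and it assumes the matches arrive sorted by start position,
as in the 2.3.0 output above.

```r
# Hypothetical helper (not part of R): drop every match that starts
# inside a previously kept match, so only the leftmost runs remain.
collapse_overlaps <- function(m) {
  len <- attr(m, "match.length")
  # gregexpr signals "no match" with a single -1; pass it through
  if (length(m) == 1L && m[1] == -1L) return(m)
  keep <- logical(length(m))
  last_end <- 0L
  for (i in seq_along(m)) {
    if (m[i] > last_end) {
      keep[i] <- TRUE
      last_end <- m[i] + len[i] - 1L
    }
  }
  structure(m[keep], match.length = len[keep])
}

# Positions and lengths copied from the 2.3.0 output above
m <- structure(c(2L, 3L, 4L, 5L, 7L, 8L, 9L, 10L),
               match.length = c(4L, 3L, 2L, 1L, 4L, 3L, 2L, 1L))
res <- collapse_overlaps(m)
# res holds positions 2 7 with match.length 4 4
```

Applied to a real call this would look like
lapply(gregexpr("[a]{1,}", x, perl=TRUE), collapse_overlaps); on output
that already has no overlaps, as in 2.2.0, the helper keeps every match.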


Thanks in advance for your help.

Thomas

-- 
Thomas Girke, Ph.D.
1008 Noel T. Keen Hall
Center for Plant Cell Biology (CEPCEB)
University of California
Riverside, CA 92521

E-mail: thomas.girke at ucr.edu
Ph: 951-827-2469
Fax: 951-827-4437



