[Rd] gregexpr (PR#9965)

Robert Gentleman rgentlem at fhcrc.org
Thu Oct 11 17:03:50 CEST 2007


Yes, we had originally wanted it to find all matches, but user 
complaints that it did not behave as Perl does prevailed. 
There are different ways to do this, but the convention that one 
should not start looking for the next match until after the end of 
the previous one seems to be more common.  I consciously decided not 
to add a switch; instead we wrote a function that does what we wanted 
and put it in the Biostrings package (from Bioconductor) as gregexpr2 
(sorry, but only fixed = TRUE is supported, since that is all we needed).
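
For readers who want to see the two conventions side by side, here is a 
minimal base-R sketch using the string from the report.  The helper 
find_overlapping is a hypothetical name introduced only for this 
illustration; it is not the Biostrings implementation, and like that 
function it only handles fixed (literal) patterns.

```r
# Hypothetical helper (not part of base R or Biostrings): return the start
# positions of every occurrence of a fixed pattern, overlapping ones included.
find_overlapping <- function(pattern, text) {
  n <- nchar(pattern)
  # Every position at which a substring of length n could begin.
  starts <- seq_len(max(nchar(text) - n + 1L, 0L))
  # Keep the positions whose substring equals the pattern exactly.
  starts[substring(text, starts, starts + n - 1L) == pattern]
}

x <- "ababab"

# gregexpr resumes searching after the end of the previous match,
# so only the match starting at position 1 is reported.
non_overlapping <- as.integer(gregexpr("abab", x, fixed = TRUE)[[1]])

# Checking every start position also finds the match at position 3.
overlapping <- find_overlapping("abab", x)

print(non_overlapping)
print(overlapping)
```

The helper tests each candidate start position independently, which is 
simple but does work proportional to the text length times the pattern 
length; that is fine for short strings and is meant only to show the 
difference in semantics.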

best wishes
   Robert


Prof Brian Ripley wrote:
> This was a deliberate change for R 2.4.0 with SVN log:
> 
> r38145 | rgentlem | 2006-05-20 23:58:14 +0100 (Sat, 20 May 2006) | 2 lines
> fixing gregexpr infelicity
> 
> So it seems the author of gregexpr believed that the bug was in 2.3.1, not 
> 2.5.1.
> 
> On Wed, 10 Oct 2007, dolanp at science.oregonstate.edu wrote:
> 
>> Full_Name: Peter Dolan
>> Version: 2.5.1
>> OS: Windows
>> Submission from: (NULL) (128.193.227.43)
>>
>>
>> gregexpr does not find all matching substrings if the substrings overlap:
>>
>>> gregexpr("abab","ababab")
>> [[1]]
>> [1] 1
>> attr(,"match.length")
>> [1] 4
>>
>> It does work correctly in Version 2.3.1 under linux.
> 
> 'correctly' is a matter of definition, I believe: this could be considered 
> to be vaguely worded in the help.
> 
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
> 

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


