[Rd] The regular expressions in compareVersion()

Duncan Murdoch murdoch.duncan at gmail.com
Fri Apr 25 14:04:34 CEST 2014


On 24/04/2014, 10:27 PM, Simon Urbanek wrote:
> FWIW the link has a long thread that is 90% irrelevant - AFAICS the relevant part is
>
> From: Yihui Xie-2
> Sep 02, 2013; 4:11pm
> Re: Sweave: printing an underscore in the output from an R command
> [...]
> Now you are good at the regular expression level, but Sweave comes and
> bites you, and that is due to this bug in the regular expression in
> Sweave Noweb syntax:
>
>> SweaveSyntaxNoweb$docexpr
> [1] "\\\\Sexpr\\{([^\\}]*)\\}"
>
> It should have been "\\\\Sexpr\\{([^}]*)\\}", i.e. } does not need to
> be escaped inside [], and \\ will be interpreted literally inside [].
> In your case, Sweave sees \ in \Sexpr{}, and the regular expression
> stops matching there, and is unable to see } after \, so it believes
> there are no inline R expressions in your document.
>

Thanks.  I've put in a bug report on this one now, so it shouldn't get 
missed again.  If nobody else gets to it first I'll deal with it.

I don't see any value in fixing the compareVersion example, but if 
someone submits a bug report about it, someone else might fix it.

Duncan Murdoch
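
For the record, the Sweave behaviour described in the quoted message can be sketched in R as below. This is only an illustration, not part of the bug report: the input `line` is a made-up example, and the names `old` and `new` are placeholders for the pattern quoted above and the suggested replacement. It assumes R's default regular expression engine (perl = FALSE), where a backslash inside [] is taken literally.

```r
# Pattern quoted in the thread and the suggested replacement.
old <- "\\\\Sexpr\\{([^\\}]*)\\}"
new <- "\\\\Sexpr\\{([^}]*)\\}"

# Hypothetical document line: an inline expression that contains backslashes.
line <- "Value: \\Sexpr{gsub('_', '\\\\_', 'a_b')}"

# With the old pattern the class [^\\}] excludes both '\' and '}', so the
# match cannot get past the backslash inside the braces.
old_hit <- regexpr(old, line)

# With the suggested pattern only '}' is excluded, so the whole
# \Sexpr{...} is found.
new_hit <- regexpr(new, line)

print(old_hit > 0)  # whether the old pattern found an inline expression
print(new_hit > 0)  # whether the suggested pattern found one
```

Under that assumption the first result should be FALSE and the second TRUE, which is the behaviour the quoted message describes.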

>
> On Apr 24, 2014, at 10:15 PM, Yihui Xie <xie at yihui.name> wrote:
>
>> You are right that this is unlikely to cause problems, because users
>> are unlikely to put backslashes in version numbers. Henrik has pointed
>> out the problem. It is not about "making the source code a little
>> cleaner", but "making it correct". Either someone in R core corrects
>> the wrong regular expressions in a few seconds (unless you think \ can
>> be a legal character in a version number), or I just give up the
>> report. It seems the latter is easier. It is not worth additional
>> Q&A's back and forth.
>>
>> Regarding the regular expression problem for \Sexpr{} in Sweave,
>> please see here for a record:
>> http://r.789695.n4.nabble.com/Sweave-printing-an-underscore-in-the-output-from-an-R-command-td4675177.html
>> As I said, it is a similar problem: someone tried to escape a
>> character that did not need to be escaped in [].
>>
>> Regards,
>> Yihui
>> --
>> Yihui Xie <xieyihui at gmail.com>
>> Web: http://yihui.name
>>
>>
>> On Thu, Apr 24, 2014 at 6:20 PM, Duncan Murdoch
>> <murdoch.duncan at gmail.com> wrote:
>>> On 24/04/2014, 5:26 PM, Henrik Bengtsson wrote:
>>>>
>>>> On Thu, Apr 24, 2014 at 1:42 PM, Duncan Murdoch
>>>> <murdoch.duncan at gmail.com> wrote:
>>>>>
>>>>> On 24/04/2014, 1:11 PM, Yihui Xie wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I guess the backslash should not be used as the separator for
>>>>>> strsplit() in compareVersion(), because the period in [.] is no longer
>>>>>> a metacharacter (no need to "escape" it using a backslash):
>>>>>>
>>>>>>
>>>>>> https://github.com/wch/r-source/blob/trunk/src/library/utils/R/packages.R#L866-L867
>>>>>>
>>>>>>> compareVersion
>>>>>>
>>>>>>
>>>>>> function (a, b)
>>>>>> {
>>>>>> ....
>>>>>>       a <- as.integer(strsplit(a, "[\\.-]")[[1L]])
>>>>>>       b <- as.integer(strsplit(b, "[\\.-]")[[1L]])
>>>>>> ....
>>>>>> <environment: namespace:utils>
>>>>>
>>>>>
>>>>>
>>>>> Could you post an example where this causes trouble, or are you just
>>>>> suggesting this as a way to make the source a little cleaner?
>>>>
>>>>
>>>> Maybe it's already clear, but [\\.] is the set for the two symbols '\'
>>>> and '.', not '.' alone.  For example, I would expect an error below:
>>>>
>>>>> compareVersion("3.14-59.26", "3.14-59\\26")
>>>>
>>>> [1] 0
>>>>
>>>
>>> How does that cause problems?
>>>
>>> Duncan Murdoch
>>>
>>>
>>>> /Henrik
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> A similar regular expression problem also exists in the Sweave syntax
>>>>>> (for \Sexpr{}), and I have reported it once. It was fixed but the fix
>>>>>> was immediately reverted for some reason:
>>>>>>
>>>>>>
>>>>>> https://github.com/wch/r-source/commit/52b0a46e15136a7f9e4777e9960fdda6d84880c0
>>>>>
>>>>>
>>>>>
>>>>> A link to your report would be more useful, if it included an example
>>>>> where the bad regexp causes trouble.
>>>>>
>>>>> Duncan Murdoch
>>
>
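
The compareVersion() point raised in the thread can be illustrated with a small R sketch. It only reproduces the strsplit() step shown in the quoted source, using the version string from Henrik's example; the variable names are illustrative, and R's default regular expression engine (perl = FALSE) is assumed.

```r
# Version string from the example in the thread, with a backslash in it.
v <- "3.14-59\\26"

# Class as quoted from compareVersion(): '\', '.' and '-' all act as separators.
with_backslash <- strsplit(v, "[\\.-]")[[1L]]

# Class without the backslash: only '.' and '-' are separators, so the
# last field keeps the backslash and as.integer() on it would give NA.
without_backslash <- strsplit(v, "[.-]")[[1L]]

print(with_backslash)
print(without_backslash)
```

With the quoted class the string is split into four numeric fields, which is why the comparison in the thread returns 0 instead of failing on the malformed version.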