[Rd] (PR#13283) R crashes on sprintf with bad format

ripley at stats.ox.ac.uk ripley at stats.ox.ac.uk
Fri Nov 14 19:30:09 CET 2008


But %S is not valid in C99 or POSIX, even if it is a variant in some 
systems.

I am working on a more careful checker right now, but there will be limits 
to what we can catch: this was already a pretty rare example.

Brian

On Fri, 14 Nov 2008, William Dunlap wrote:

>> From: r-devel-bounces at r-project.org
>> [mailto:r-devel-bounces at r-project.org] On Behalf Of Prof Brian Ripley
>> Sent: Friday, November 14, 2008 2:25 AM
>> To: Duncan Murdoch
>> Cc: R-bugs at r-project.org; ocheyett at bonddesk.com;
>> r-devel at stat.math.ethz.ch
>> Subject: Re: [Rd] (PR#13283) R crashes on sprintf with bad
>> format specification
>>
>> As R's sprintf is a wrapper for the OS's sprintf, misuse does
>> run the risk of crashing from OS, and when it does the error
>> will come from the implementation of sprintf (which for R for
>> Windows is the Trio library).
>> One could argue that the OS service should not segfault on
>> incorrect input, but this one often does.
>>
>> The place to add checks is in the R wrapper code (presumably
>> in the C component).  We do have checks there, but it is hard
>> to imagine just what errors users will make, and this one has
>> slipped through the checks.
>
> The following change recognizes 'S' as a format type (but doesn't
> do anything with it) and thus avoids the crash.
>
> --- sprintf.c   (revision 46944)
> +++ sprintf.c   (working copy)
> @@ -103,7 +103,7 @@
>                else {
>                    /* recognise selected types from Table B-1 of K&R */
>                    /* This is MBCS-OK, as we are in a format spec */
> -                   chunk = strcspn(formatString + cur + 1,
> "aAdisfeEgGxX%") + 2;
> +                   chunk = strcspn(formatString + cur + 1,
> "aAdisSfeEgGxX%") + 2;
>                    if (cur + chunk > n)
>                        error(_("unrecognised format at end of
> string"));
>
> The resulting error messages are not terribly direct, but nicer
> than the crash
>   > sprintf("A %S %S %S XYZ", 1, 1, 1)
>   Error in sprintf("A %S %S %S XYZ", 1, 1, 1) :
>     use format %f, %e, %g or %a for numeric objects
>   > sprintf("A %S %S %S XYZ", "foo", "bar", "foo")
>   Error in sprintf("A %S %S %S XYZ", "foo", "bar", "foo") :
>     use format %s for character objects
>
> One could add 'S' to some switch cases below there to have
> it convert the corresponding argument to a wide character string
> or say directly that %S is not allowed.
>
> This piece of code finds the conversion type by looking for this
> first of a bunch of known conversion characters after the %, which
> doesn't find lots of illegal formats and causes some surprises.  E.g.,
> because 'p' is not in that list we get:
>   > sprintf("%p", 3)
>   Error in sprintf("%p", 3) : unrecognised format at end of string
>   > sprintf("%p is a pointer!", 3)
>   [1] "0x3 is a pointer!"
>   > sprintf("%p %o", 17L, 17L)
>   [1] "0x4 o"
>
>> Here the issue is the use of %S, an unknown format for
>> numbers, followed by 'X'.  There is a check for a suitable
>> format, but it is not strict enough.  And it is non-trivial
>> to write a printf format parser just to check for user error
>> (and potentially it will slow down correct uses:
>> speed in sprintf is important in some applications).
>>
>> So I am not sure this is something that should be R's
>> responsibility to
>> fix: maybe just add a warning to the help page?
>>
>>
>> On Thu, 13 Nov 2008, Duncan Murdoch wrote:
>>
>>> On 12/11/2008 8:30 PM, ocheyett at bonddesk.com wrote:
>>>> Full_Name: Oren Cheyette
>>>> Version: 2.7.2
>>>> OS: Win XP
>>>> Submission from: (NULL) (64.161.123.194)
>>>>
>>>>
>>>> Enter the following at the R command prompt:
>>>>> sprintf("A %S %S %S XYZ", 1, 1, 1);
>>
>> R is not C, and the empty command at the end of that line (after the
>> semicolon) is not relevant.
>>
>>>> Note the erroneous capitalized %S instead of %s and the
>> numeric inputs
>>>> instead
>>>> of strings. With strings there's no crash - R reports bad format
>>>> specifications.
>>>
>>> 2.7.2 is obsolete, but I can confirm a crash on Windows
>> with a recent
>>> R-devel.
>>>
>>> The last few entries in the stack dump at the time of the
>> crash are shown
>>> below; these make it look as though the problem is in the
>> Trio library, so it
>>> may be hard to fix.
>>>
>>> Duncan Murdoch
>>>
>>>
>>> Rgui.exe caused an Access Violation at location 6c913fb3 in
>> module R.dll
>>> Reading from location 00000001.
>>>
>>> Registers:
>>> eax=7fffffff ebx=00000000 ecx=00e17b21 edx=00000001
>> esi=00e1c83b edi=0000000a
>>> eip=6c913fb3 esp=00e17944 ebp=00e17b3c iopl=0         nv up
>> ei pl nz na po nc
>>> cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000 efl=00000206
>>>
>>> Call stack:
>>> 6C913FB3  R.dll:6C913FB3  TrioFormatProcess  trio.c:2724
>>>
>>> 	...
>>> 	  while (length > 0)
>>> 	    {
>>>> 	      size = TrioWriteWideStringCharacter(self,
>> *wstring++, flags,
>>> length);
>>> 	      if (size == 0)
>>> 	break; /* while */
>>> 	...
>>>
>>> 6C916592  R.dll:6C916592  trio_vsprintf  trio.c:3771
>>>
>>> 	...
>>> 	    return status;
>>>
>>>> 	  status = TrioFormatProcess(&data, format, parameters);
>>> 	  if (data.error != 0)
>>> 	    {
>>> 	...
>>>
>>> 6C911F62  R.dll:6C911F62  sprintf  compat.c:46
>>>
>>> 	...
>>> 	    va_end(ap);
>>> 	    return res;
>>>> 	}
>>>
>>>
>>> 	...
>>>
>>> 6C889F1E  R.dll:6C889F1E  do_sprintf  sprintf.c:297
>>>
>>> 	...
>>> 	    sprintf(bit, fmtp, " NaN");
>>> 	else
>>>> 	    sprintf(bit, fmtp, "NaN");
>>> 	    } else if (x == R_PosInf) {
>>> 	if (strcspn(fmtp, "+") < strlen(fmtp))
>>> 	...
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> --
>> Brian D. Ripley,                  ripley at stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford,             Tel:  +44 1865 272861 (self)
>> 1 South Parks Road,                     +44 1865 272866 (PA)
>> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-devel mailing list