[Rd] (PR#13283) R crashes on sprintf with bad format	specification
    wdunlap at tibco.com 
    wdunlap at tibco.com
       
    Fri Nov 14 19:25:15 CET 2008
    
    
  
> From: r-devel-bounces at r-project.org=20
> [mailto:r-devel-bounces at r-project.org] On Behalf Of Prof Brian Ripley
> Sent: Friday, November 14, 2008 2:25 AM
> To: Duncan Murdoch
> Cc: R-bugs at r-project.org; ocheyett at bonddesk.com;=20
> r-devel at stat.math.ethz.ch
> Subject: Re: [Rd] (PR#13283) R crashes on sprintf with bad=20
> format specification
>=20
> As R's sprintf is a wrapper for the OS's sprintf, misuse does=20
> run the risk of crashing from OS, and when it does the error=20
> will come from the implementation of sprintf (which for R for=20
> Windows is the Trio library).
> One could argue that the OS service should not segfault on=20
> incorrect input, but this one often does.
>=20
> The place to add checks is in the R wrapper code (presumably=20
> in the C component).  We do have checks there, but it is hard=20
> to imagine just what errors users will make, and this one has=20
> slipped through the checks.
The following change recognizes 'S' as a format type (but doesn't
do anything with it) and thus avoids the crash. =20
--- sprintf.c   (revision 46944)
+++ sprintf.c   (working copy)
@@ -103,7 +103,7 @@
                else {
                    /* recognise selected types from Table B-1 of K&R */
                    /* This is MBCS-OK, as we are in a format spec */
-                   chunk =3D strcspn(formatString + cur + 1,
"aAdisfeEgGxX%") + 2;
+                   chunk =3D strcspn(formatString + cur + 1,
"aAdisSfeEgGxX%") + 2;
                    if (cur + chunk > n)
                        error(_("unrecognised format at end of
string"));
The resulting error messages are not terribly direct, but nicer
than the crash
   > sprintf("A %S %S %S XYZ", 1, 1, 1)
   Error in sprintf("A %S %S %S XYZ", 1, 1, 1) :
     use format %f, %e, %g or %a for numeric objects
   > sprintf("A %S %S %S XYZ", "foo", "bar", "foo")
   Error in sprintf("A %S %S %S XYZ", "foo", "bar", "foo") :
     use format %s for character objects
One could add 'S' to some switch cases below there to have
it convert the corresponding argument to a wide character string
or say directly that %S is not allowed.
This piece of code finds the conversion type by looking for this
first of a bunch of known conversion characters after the %, which
doesn't find lots of illegal formats and causes some surprises.  E.g.,
because 'p' is not in that list we get:
   > sprintf("%p", 3)
   Error in sprintf("%p", 3) : unrecognised format at end of string
   > sprintf("%p is a pointer!", 3)
   [1] "0x3 is a pointer!"
   > sprintf("%p %o", 17L, 17L)
   [1] "0x4 o"
> Here the issue is the use of %S, an unknown format for=20
> numbers, followed by 'X'.  There is a check for a suitable=20
> format, but it is not strict enough.  And it is non-trivial=20
> to write a printf format parser just to check for user error=20
> (and potentially it will slow down correct uses:=20
> speed in sprintf is important in some applications).
>=20
> So I am not sure this is something that should be R's=20
> responsibility to
> fix: maybe just add a warning to the help page?
>=20
>=20
> On Thu, 13 Nov 2008, Duncan Murdoch wrote:
>=20
> > On 12/11/2008 8:30 PM, ocheyett at bonddesk.com wrote:
> >> Full_Name: Oren Cheyette
> >> Version: 2.7.2
> >> OS: Win XP
> >> Submission from: (NULL) (64.161.123.194)
> >>=20
> >>=20
> >> Enter the following at the R command prompt:
> >>> sprintf("A %S %S %S XYZ", 1, 1, 1);
>=20
> R is not C, and the empty command at the end of that line (after the=20
> semicolon) is not relevant.
>=20
> >> Note the erroneous capitalized %S instead of %s and the=20
> numeric inputs=20
> >> instead
> >> of strings. With strings there's no crash - R reports bad format
> >> specifications.
> >
> > 2.7.2 is obsolete, but I can confirm a crash on Windows=20
> with a recent=20
> > R-devel.
> >
> > The last few entries in the stack dump at the time of the=20
> crash are shown=20
> > below; these make it look as though the problem is in the=20
> Trio library, so it=20
> > may be hard to fix.
> >
> > Duncan Murdoch
> >
> >
> > Rgui.exe caused an Access Violation at location 6c913fb3 in=20
> module R.dll=20
> > Reading from location 00000001.
> >
> > Registers:
> > eax=3D7fffffff ebx=3D00000000 ecx=3D00e17b21 edx=3D00000001=20
> esi=3D00e1c83b edi=3D0000000a
> > eip=3D6c913fb3 esp=3D00e17944 ebp=3D00e17b3c iopl=3D0         nv up=20
> ei pl nz na po nc
> > cs=3D001b  ss=3D0023  ds=3D0023  es=3D0023  fs=3D003b  gs=3D0000 =
efl=3D00000206
> >
> > Call stack:
> > 6C913FB3  R.dll:6C913FB3  TrioFormatProcess  trio.c:2724
> >
> > 	...
> > 	  while (length > 0)
> > 	    {
> >> 	      size =3D TrioWriteWideStringCharacter(self,=20
> *wstring++, flags,=20
> > length);
> > 	      if (size =3D=3D 0)
> > 	break; /* while */
> > 	...
> >
> > 6C916592  R.dll:6C916592  trio_vsprintf  trio.c:3771
> >
> > 	...
> > 	    return status;
> >
> >> 	  status =3D TrioFormatProcess(&data, format, parameters);
> > 	  if (data.error !=3D 0)
> > 	    {
> > 	...
> >
> > 6C911F62  R.dll:6C911F62  sprintf  compat.c:46
> >
> > 	...
> > 	    va_end(ap);
> > 	    return res;
> >> 	}
> >
> >
> > 	...
> >
> > 6C889F1E  R.dll:6C889F1E  do_sprintf  sprintf.c:297
> >
> > 	...
> > 	    sprintf(bit, fmtp, " NaN");
> > 	else
> >> 	    sprintf(bit, fmtp, "NaN");
> > 	    } else if (x =3D=3D R_PosInf) {
> > 	if (strcspn(fmtp, "+") < strlen(fmtp))
> > 	...
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>=20
> --=20
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>=20
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>=20
    
    
More information about the R-devel
mailing list