[Rd] strtoi output of empty string inconsistent across platforms

Michael Chirico mich@elchirico4 @ending from gm@il@com
Sat Jan 12 10:34:03 CET 2019


Thanks Martin.

For what it's worth, this extremely representative, highly scientific
Twitter poll suggests the Mac/Linux split is pretty stark (NA on Mac, 0 on
Linux)

https://twitter.com/michael_chirico/status/1083649190117306369?s=17

On Sat, Jan 12, 2019, 2:00 AM Martin Maechler <maechler using stat.math.ethz.ch
wrote:

> >>>>> Martin Maechler
> >>>>>     on Fri, 11 Jan 2019 09:44:14 +0100 writes:
>
> >>>>> Michael Chirico
> >>>>>     on Fri, 11 Jan 2019 14:36:17 +0800 writes:
>
>     >> Identified as root cause of a bug in data.table:
>     >> https://github.com/Rdatatable/data.table/issues/3267
>
>     >> On my machine, strtoi("", base = 2L) produces NA_integer_
>     >> (which seems consistent with ?strtoi: "Values which
>     >> cannot be interpreted as integers or would overflow are
>     >> returned as NA_integer_").
>
>     > indeed consistent with R's documentation on strtoi().
>     > What machine would that be?
>
>     >> But on all the other machines I've seen, 0L is
>     >> returned. This seems to be consistent with the output of
>     >> a simple C program using the underlying strtol function
>     >> (see data.table link for this program, and for full
>     >> sessionInfo() of some environments with differing
>     >> output).
>
>     >> So, what is the correct output of strtoi("", base = 2L)?
>
>     >> Is the cross-platform inconsistency to be
>     >> expected/documentable?
>
>     > The inconsistency is certainly undesirable.  The relevant
>     > utility function in R's source (<R>/src/main/character.c)
>     > is
>
>     > static int strtoi(SEXP s, int base) { long int res; char
>     > *endp;
>
>     >     /* strtol might return extreme values on error */
>     > errno = 0;
>
>     >     if(s == NA_STRING) return(NA_INTEGER); res =
>     > strtol(CHAR(s), &endp, base); /* ASCII */ if(errno ||
>     > *endp != '\0') res = NA_INTEGER; if(res > INT_MAX || res <
>     > INT_MIN) res = NA_INTEGER; return (int) res; }
>
>     > and so it clearly is a platform-inconsistency in the
>     > underlying C library's strtol().
>
> (corrected typos here: )
>
>     > I think we should make this cross-platform consistent ...
>     > and indeed it makes much sense to ensure the result of
>
>     >     strtoi("", base=2L)    to become   NA_integer_
>
>     > but chances are that would break code that has relied on
>     > the current behavior {on "all but your computer" ;-)} ?
>
> I still think that such a change should be done.
>
> 'make check all' on the R source (+ Recommended packages) seems
> not to signal any error or warning with such a change, so I plan
> to commit that change to "the trunk" / "R-devel" soon, unless
> concerns are raised highly (and quickly enough).
>
> Martin
>

	[[alternative HTML version deleted]]



More information about the R-devel mailing list