[Rd] behavior of as.integer("5000000000")

Martin Maechler maechler at lynne.stat.math.ethz.ch
Fri Apr 17 15:49:35 CEST 2015


>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>>     on Mon, 13 Apr 2015 23:36:14 -0700 writes:

    > On 04/13/2015 11:32 PM, Martin Maechler wrote:
    >> 
    >>> Hi,
    >>> > as.integer("5000000000")
    >>> [1] 2147483647
    >>> Warning message:
    >>> inaccurate integer conversion in coercion
    >> 
    >>> > as.integer("-5000000000")
    >>> [1] NA
    >>> Warning message:
    >>> inaccurate integer conversion in coercion
    >> 
    >>> Is this a bug or a feature? The man page suggests it's the
    >>> latter:
    >> 
    >> I think you mean the "former", a bug.
    >> 
    >> and I agree entirely, see the following  " 2 x 2 " comparison :
    >> 
    >> > N <- 5000000000000 * 8^-(0:7)
    >> > as.integer(N)
    >> [1]         NA         NA         NA         NA 1220703125  152587890   19073486    2384185
    >> Warning message:
    >> NAs introduced by coercion
    >> > as.integer(-N)
    >> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
    >> [8]    -2384185
    >> Warning message:
    >> NAs introduced by coercion
    >> > as.integer(as.character(N))
    >> [1] 2147483647 2147483647 2147483647 2147483647 1220703125  152587890   19073486    2384185
    >> Warning message:
    >> inaccurate integer conversion in coercion
    >> > as.integer(as.character(-N))
    >> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
    >> [8]    -2384185
    >> Warning message:
    >> inaccurate integer conversion in coercion
    >> 
    >> 
    >> 
    >>> ‘as.integer’ attempts to coerce its argument to be of integer
    >>> type.  The answer will be ‘NA’ unless the coercion succeeds.
    >> 
    >>> even though someone could always argue that coercion of "5000000000"
    >>> succeeded (for some definition of "succeed").
    >> 
    >>> Also is there any reason why the warning message is different than
    >>> with:
    >> 
    >>> > as.integer(-5000000000)
    >>> [1] NA
    >>> Warning message:
    >>> NAs introduced by coercion
    >> 
    >>> In the case of as.integer("-5000000000"), it's not really that the
    >>> conversion was "inaccurate", it's a little bit worse than that. And
    >>> knowing that NAs were introduced by coercion is important.
    >> 
    >> Yes.
    >> The message is less a problem than the bug, but I agree we
    >> should try to improve it.

    > Sounds good. Thanks Martin,

I've now committed a change to R-devel so that this case also
returns NA with a warning; for the moment it emits both the
old warning and the 'NAs introduced by coercion' warning.
The "nice thing" about the old warning is that it explicitly
mentions integer coercion.

I currently think we should keep that property, and I'd propose
to completely drop the 
   "inaccurate integer conversion in coercion"
warning (it is not used anywhere else currently) and replace it
in this and other as.integer(.) cases with

  'NAs introduced by integer coercion'

(or something similar. ... improvements / proposals are welcome).
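
To illustrate the behaviour being proposed (out-of-range strings become
NA, with a warning that mentions integer coercion), here is a minimal
sketch in user-level R. The helper name to_int_checked() and its exact
warning text are hypothetical and used only for illustration; this is
not the change committed to R-devel.

```r
# Hypothetical helper, NOT part of R: convert strings to integer and turn
# every value outside the integer range into NA with an explicit warning.
to_int_checked <- function(x) {
  # Parse as double first; unparsable strings become NA here.
  d <- suppressWarnings(as.numeric(x))
  # Values whose magnitude exceeds .Machine$integer.max cannot be integers.
  out_of_range <- !is.na(d) & abs(d) > .Machine$integer.max
  if (any(out_of_range))
    warning("NAs introduced by integer coercion (value out of range)")
  d[out_of_range] <- NA
  # Everything left is NA or within range, so this conversion is safe.
  as.integer(d)
}

res <- suppressWarnings(to_int_checked(c("5000000000", "-5000000000", "42")))
print(res)
```

With this sketch both "5000000000" and "-5000000000" are treated the same
way, which is the symmetry the original report asked for.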

BTW, the fact that as.integer("-5000000000") produced NA rather than
-2147483647 (which would have been consistent with as.integer("5000000000"))
was just another coincidence, namely that we "currently" code NA_integer_
as INT_MIN (for 32-bit integers, INT_MIN = -2147483648 = -2^31)
[[but your C code must not rely on that, it is an implementation detail!]]
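
The visible consequence at the R level can be checked with a short
sketch like the following. The variable names are chosen here for
illustration only; the code shows just that -2^31 is not available as
an ordinary integer value, not how NA is stored internally.

```r
int_max  <- .Machine$integer.max   # 2147483647 = 2^31 - 1
smallest <- -int_max               # -2147483647, the smallest non-NA integer

# One step further down overflows in integer arithmetic and yields NA.
below <- suppressWarnings(smallest - 1L)
print(is.na(below))

# The double -2^31 cannot be coerced to a non-NA integer either.
print(is.na(suppressWarnings(as.integer(-2^31))))
```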

Martin


