[Rd] behavior of as.integer("5000000000")

Martin Maechler maechler at lynne.stat.math.ethz.ch
Fri Apr 17 17:24:04 CEST 2015


>>>>> Martin Maechler <maechler at lynne.stat.math.ethz.ch>
>>>>>     on Fri, 17 Apr 2015 15:49:35 +0200 writes:

>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>>     on Mon, 13 Apr 2015 23:36:14 -0700 writes:

    >> On 04/13/2015 11:32 PM, Martin Maechler wrote:
    >>> 
    >>>> Hi,
    >>>> > as.integer("5000000000")
    >>>> [1] 2147483647
    >>>> Warning message:
    >>>> inaccurate integer conversion in coercion
    >>> 
    >>>> > as.integer("-5000000000")
    >>>> [1] NA
    >>>> Warning message:
    >>>> inaccurate integer conversion in coercion
    >>> 
    >>>> Is this a bug or a feature? The man page suggests it's the
    >>>> latter:
    >>> 
    >>> I think you mean the "former", a bug.
    >>> 
    >>> and I agree entirely; see the following "2 x 2" comparison:
    >>> 
    >>> > N <- 5000000000000 * 8^-(0:7)
    >>> > as.integer(N)
    >>> [1]         NA         NA         NA         NA 1220703125  152587890   19073486    2384185
    >>> Warning message:
    >>> NAs introduced by coercion
    >>> > as.integer(-N)
    >>> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
    >>> [8]    -2384185
    >>> Warning message:
    >>> NAs introduced by coercion
    >>> > as.integer(as.character(N))
    >>> [1] 2147483647 2147483647 2147483647 2147483647 1220703125  152587890   19073486    2384185
    >>> Warning message:
    >>> inaccurate integer conversion in coercion
    >>> > as.integer(as.character(-N))
    >>> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
    >>> [8]    -2384185
    >>> Warning message:
    >>> inaccurate integer conversion in coercion
    >>> 
    >>> 
    >>> 
    >>>> ‘as.integer’ attempts to coerce its argument to be of integer
    >>>> type.  The answer will be ‘NA’ unless the coercion succeeds.
    >>> 
    >>>> even though someone could always argue that coercion of "5000000000"
    >>>> succeeded (for some definition of "succeed").
    >>> 
    >>>> Also is there any reason why the warning message is different than
    >>>> with:
    >>> 
    >>>> > as.integer(-5000000000)
    >>>> [1] NA
    >>>> Warning message:
    >>>> NAs introduced by coercion
    >>> 
    >>>> In the case of as.integer("-5000000000"), it's not really that the
    >>>> conversion was "inaccurate", it's a little bit worse than that. And
    >>>> knowing that NAs were introduced by coercion is important.
    >>> 
    >>> Yes.
    >>> The message is less a problem than the bug, but I agree we
    >>> should try to improve it.

    >> Sounds good. Thanks Martin,

    > I've committed a change to R-devel now, such that also this case
    > returns NA with a warning, actually for the moment with both the
    > old warning and the   'NAs introduced by coercion' warning.
    > The "nice thing" about the old warning is that it explicitly
    > mentions integer coercion.

    > I currently think we should keep that property, and I'd propose
    > to completely drop the 
    > "inaccurate integer conversion in coercion"
    > warning (it is not used anywhere else currently) and replace it
    > in this and other as.integer(.) cases with

    > 'NAs introduced by integer coercion'

    > (or something similar. ... improvements / proposals are welcome).

Replying to myself:

I've found 

     'NAs introduced by coercion to integer range'

to be even more "on spot", and so will commit it for today.
Of course, amendment proposals are still welcome.

Martin



    > BTW, the fact that as.integer("-5000000000") did produce an NA
    > instead of -2147483647, which would have been consistent with as.integer("5000000000"),
    > was just another coincidence, namely that we "currently" code NA_integer_
    > by INT_MIN (for 32 bit integers, INT_MIN = -2147483648 = -2^31)
    > [[but your C code must not rely on that, it is an implementation detail!]]

    > Martin
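
The coincidence described above can be illustrated with a minimal C sketch. This is not R's actual source: the names clamp_from_string, checked_from_string and SKETCH_NA_INT are hypothetical, and the sentinel is set to INT_MIN only to mirror the current implementation detail mentioned in the thread. The first function clamps out-of-range values, so a large negative input happens to land on the sentinel; the second maps everything outside the usable range to the sentinel on purpose.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for an integer NA sentinel. It is set to INT_MIN
   here only to mirror the implementation detail discussed in the thread;
   real code should not rely on that value. */
#define SKETCH_NA_INT INT_MIN

/* Clamping conversion: out-of-range values are pinned to the nearest int.
   A large negative input is pinned to INT_MIN, which happens to equal the
   sentinel, so it looks like NA purely by accident. */
static int clamp_from_string(const char *s)
{
    assert(s != NULL);
    double x = strtod(s, NULL);
    if (x > INT_MAX) return INT_MAX;
    if (x < INT_MIN) return INT_MIN;
    return (int) x;
}

/* Range-checked conversion: anything outside the open interval
   (INT_MIN, INT_MAX + 1) is deliberately mapped to the sentinel. */
static int checked_from_string(const char *s)
{
    assert(s != NULL);
    double x = strtod(s, NULL);
    if (x >= INT_MAX + 1.0 || x <= INT_MIN) return SKETCH_NA_INT;
    return (int) x;
}
```

Under these assumptions, clamp_from_string("5000000000") yields INT_MAX while clamp_from_string("-5000000000") yields the sentinel, which reproduces the asymmetry reported at the top of the thread; checked_from_string returns the sentinel for both inputs.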


