[Rd] behavior of as.integer("5000000000")

Hervé Pagès hpages at fredhutch.org
Fri Apr 17 19:12:02 CEST 2015



On 04/17/2015 08:24 AM, Martin Maechler wrote:
>>>>>> Martin Maechler <maechler at lynne.stat.math.ethz.ch>
>>>>>>      on Fri, 17 Apr 2015 15:49:35 +0200 writes:
>
>>>>>> Hervé Pagès <hpages at fredhutch.org>
>>>>>>      on Mon, 13 Apr 2015 23:36:14 -0700 writes:
>
>      >> On 04/13/2015 11:32 PM, Martin Maechler wrote:
>      >>>
>      >>>> Hi,
>      >>>> > as.integer("5000000000")
>      >>>> [1] 2147483647
>      >>>> Warning message:
>      >>>> inaccurate integer conversion in coercion
>      >>>
>      >>>> > as.integer("-5000000000")
>      >>>> [1] NA
>      >>>> Warning message:
>      >>>> inaccurate integer conversion in coercion
>      >>>
>      >>>> Is this a bug or a feature? The man page suggests it's the
>      >>>> latter:
>      >>>
>      >>> I think you mean the "former", a bug.
>      >>>
>      >>> and I agree entirely, see the following  " 2 x 2 " comparison :
>      >>>
>      >>> > N <- 5000000000000 * 8^-(0:7)
>      >>> > as.integer(N)
>      >>> [1]         NA         NA         NA         NA 1220703125  152587890   19073486    2384185
>      >>> Warning message:
>      >>> NAs introduced by coercion
>      >>> > as.integer(-N)
>      >>> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
>      >>> [8]    -2384185
>      >>> Warning message:
>      >>> NAs introduced by coercion
>      >>> > as.integer(as.character(N))
>      >>> [1] 2147483647 2147483647 2147483647 2147483647 1220703125  152587890   19073486    2384185
>      >>> Warning message:
>      >>> inaccurate integer conversion in coercion
>      >>> > as.integer(as.character(-N))
>      >>> [1]          NA          NA          NA          NA -1220703125  -152587890   -19073486
>      >>> [8]    -2384185
>      >>> Warning message:
>      >>> inaccurate integer conversion in coercion
>      >>>
>      >>>
>      >>>
>      >>>> ‘as.integer’ attempts to coerce its argument to be of integer
>      >>>> type.  The answer will be ‘NA’ unless the coercion succeeds.
>      >>>
>      >>>> even though someone could always argue that coercion of "5000000000"
>      >>>> succeeded (for some definition of "succeed").
>      >>>
>      >>>> Also is there any reason why the warning message is different than
>      >>>> with:
>      >>>
>      >>>> > as.integer(-5000000000)
>      >>>> [1] NA
>      >>>> Warning message:
>      >>>> NAs introduced by coercion
>      >>>
>      >>>> In the case of as.integer("-5000000000"), it's not really that the
>      >>>> conversion was "inaccurate", it's a little bit worse than that. And
>      >>>> knowing that NAs were introduced by coercion is important.
>      >>>
>      >>> Yes.
>      >>> The message is less a problem than the bug, but I agree we
>      >>> should try to improve it.
>
>      >> Sounds good. Thanks Martin,
>
>      > I've committed a change to R-devel now, such that also this case
>      > returns NA with a warning, actually for the moment with both the
>      > old warning and the   'NAs introduced by coercion' warning.
>      > The "nice thing" about the old warning is that it explicitly
>      > mentions integer coercion.
>
>      > I currently think we should keep that property, and I'd propose
>      > to completely drop the
>      > "inaccurate integer conversion in coercion"
>      > warning (it is not used anywhere else currently) and replace it
>      > in this and other as.integer(.) cases with
>
>      > 'NAs introduced by integer coercion'
>
>      > (or something similar. ... improvements / proposals are welcome).
>
> Replying to myself:
>
> I've found
>
>       'NAs introduced by coercion to integer range'

I like that we see "coercion *to* integer" instead of just
"integer coercion" because the former indicates the direction
of the coercion. I'm not that convinced by the "range" part,
though. I think

   as.integer(c("78", "a34", "-5000000000"))

should emit only one warning and not try to categorize the
reasons for getting an NA.
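
For anyone who wants to check how many warnings a given build of R
emits for that call, here is a minimal sketch. The names `x`, `log`,
and `res` are only illustrative, and I'm not stating the expected
number or wording of the warnings, since that depends on the R
version being tested:

```r
# Mixed input: a valid number, a non-number, and an out-of-range number.
x <- c("78", "a34", "-5000000000")

# Collect warnings in an environment instead of letting them print.
log <- new.env()
log$msgs <- character(0)

res <- withCallingHandlers(
  as.integer(x),
  warning = function(w) {
    log$msgs <- c(log$msgs, conditionMessage(w))
    invokeRestart("muffleWarning")  # keep going, don't print the warning
  }
)

res                # the coerced values; the last two are NA
length(log$msgs)   # how many separate warnings were signalled
log$msgs           # their messages
```

If `length(log$msgs)` is greater than 1, the NAs are being categorized
by cause, which is what I'd prefer to avoid.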

Thanks,
H.

>
> to be even more "on spot", and so will commit it for today.
> Of course, amendment proposals are still welcome.
>
> Martin
>
>
>
>      > BTW, the fact that as.integer("-5000000000") did produce an NA
>      > instead of -2147483647 so it would have been compatible with as.integer("5000000000")
>      > was just another coincidence, namely that we "currently" code NA_integer_
>      > by INT_MIN (for 32 bit integers, INT_MIN = -2147483648 = -2^31)
>      > [[but your C code must not rely on that, it is an implementation detail!]]
>
>      > Martin
>
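
The integer range mentioned above can be seen from R itself. This is
only a sketch of the user-visible behavior (the variable names are
illustrative); it says nothing about how NA_integer_ is stored
internally, which, as Martin notes, code must not rely on:

```r
imax <- .Machine$integer.max   # largest representable integer, 2^31 - 1
imax == 2^31 - 1               # TRUE

neg <- -imax                   # smallest non-NA integer, -(2^31 - 1)
is.integer(neg)                # TRUE

# One step further down would be -2^31, which R does not offer as an
# ordinary integer value: integer arithmetic overflows to NA instead.
below <- suppressWarnings(neg - 1L)
is.na(below)                   # TRUE

# Coercion from double shows the same boundary.
as.integer(-(2^31 - 1))                        # still representable
is.na(suppressWarnings(as.integer(-2^31)))     # TRUE
```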

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


