[Rd] NA_real_ <op> NaN -> NA or NaN, should we care?

Fri May 1 15:38:55 CEST 2009

On 5/1/2009 8:14 AM, Martin Maechler wrote:
>>>>>> William Dunlap <wdunlap at tibco.com>
>>>>>>     on Thu, 30 Apr 2009 10:51:43 -0700 writes:
> 
>     > On Linux when I compile R 2.10.0(devel) (src/main/arithmetic.c in
>     > particular) with gcc 3.4.5 using the flags -g -O2 I get noncommutative
>     > behavior when
> 
> is this really gcc 3.4.5  (which is quite old) ?
> 
> Without being an expert, I'd tend to claim this to be a
> compiler (optimization) bug ....  but most probably the ANSI /
> ISO  C (and libc ?) standards would not define the exact
> behavior of arithmetic with NaNs.

Here are a few bits of information, not particularly well organized:

I don't know whether IEEE 754 (or the ISO equivalent) specifies the 
behaviour, but the hardware handling of NaNs is inconsistent on Intel 
chips.  If two NaNs are involved in an operation, the result is the one 
with the larger significand when the operation is done in the FPU, and 
the first operand when it is done in SSE.
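
A minimal C sketch for checking the operand-order dependence described 
above on a given compiler and machine.  The helper names (from_bits, 
to_bits, add_at_runtime, show_both_orders) are made up for this 
illustration and are not taken from R's sources; the two bit patterns 
are the NA_real_ value from Bill's od output and the usual default 
quiet NaN.  What it prints depends on whether the compiler emits x87 or 
SSE instructions, so no expected output is given.

```c
#include <assert.h>
#include <inttypes.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* NA_real_ as shown by od in the -O2 build (7ff0 0000 0000 07a2). */
static const uint64_t NA_BITS = UINT64_C(0x7ff00000000007a2);
/* The usual default quiet NaN, with an otherwise empty significand. */
static const uint64_t DEFAULT_NAN_BITS = UINT64_C(0x7ff8000000000000);

/* Build a double from its raw 64-bit pattern. */
static double from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Read back the raw 64-bit pattern of a double. */
static uint64_t to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* volatile operands keep the compiler from folding the sum at compile time. */
static double add_at_runtime(double x, double y)
{
    volatile double a = x, b = y;
    return a + b;
}

/* Print the bit patterns of x + y and y + x; on some setups they differ. */
static void show_both_orders(double x, double y)
{
    printf("x + y: %016" PRIx64 "\n", to_bits(add_at_runtime(x, y)));
    printf("y + x: %016" PRIx64 "\n", to_bits(add_at_runtime(y, x)));
}
```

Calling show_both_orders(from_bits(NA_BITS), from_bits(DEFAULT_NAN_BITS)) 
from a main() and comparing the two lines, once in an x87 build and once 
in an SSE build, is one way to see which rule the generated code follows.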

Our NA has a bigger significand than the default NaN, but there are NaNs 
that are bigger.  (We're unlikely to see those in the results of 
arithmetic under our control, but there's nothing to stop external code 
from returning one, or a user from reading one from a file via readBin.)

A NaN can be either quiet or signalling; our NA is a signalling NaN, 
and the one Bill was getting from the as.numeric conversion is a quiet 
NaN.  However, our IsNA test doesn't look at the bit that determines 
that status; it only looks at the 07a2 0000 part, so both are displayed 
as NA.  I haven't traced the coercion code to find out why they aren't 
identical, but I don't think it matters:  the FPU and SSE methods are 
inconsistent for most pairs of operands.
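
To make the quiet/signalling point concrete, here is a small C sketch 
of a test in the spirit of IsNA: it only checks that the value is a NaN 
whose low 32-bit word is 1954 (0x07a2), so both of Bill's bit patterns 
pass.  The names looks_like_na and quiet_bit_set are invented for this 
illustration; this is not a copy of the code in arithmetic.c.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define NA_LOW_WORD 1954u   /* 0x07a2, the low 32 bits of R's NA */

/* The two patterns from the od output, as 64-bit values. */
static const uint64_t NA_SIGNALLING_BITS = UINT64_C(0x7ff00000000007a2);
static const uint64_t NA_QUIET_BITS      = UINT64_C(0x7ff80000000007a2);

/* Build a double from its raw 64-bit pattern. */
static double from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Read back the raw 64-bit pattern of a double. */
static uint64_t to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* NA-style test: a NaN whose low word is 1954; the quiet bit is ignored. */
static int looks_like_na(double x)
{
    return isnan(x) && (uint32_t)(to_bits(x) & 0xffffffffu) == NA_LOW_WORD;
}

/* A quiet NaN has the top significand bit (bit 51) set. */
static int quiet_bit_set(uint64_t bits)
{
    return (int)((bits >> 51) & 1u);
}
```

With these definitions, looks_like_na() is true for both patterns while 
quiet_bit_set() tells them apart, which matches what Bill saw: two 
values that differ in their bits but both print as NA.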

All of which leads me to conclude that we should either say we don't 
guarantee what happens when you mix NA with NaN, or we should add an 
explicit test to every operation.  I'd prefer the "no guarantees" solution.
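
For comparison, the "explicit test" alternative might look roughly like 
the C sketch below.  It is only an illustration under my own naming 
(add_na_first and looks_like_na are hypothetical, not functions in R): 
NA is checked before the hardware addition, so the result no longer 
depends on operand order, at the price of extra branches per element.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#define NA_LOW_WORD 1954u   /* 0x07a2, the low 32 bits of R's NA */

/* Build a double from its raw 64-bit pattern. */
static double from_bits(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Read back the raw 64-bit pattern of a double. */
static uint64_t to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* NA-style test: a NaN whose low word is 1954. */
static int looks_like_na(double x)
{
    return isnan(x) && (uint32_t)(to_bits(x) & 0xffffffffu) == NA_LOW_WORD;
}

/* Addition with an explicit NA check: NA wins over any other NaN,
   whichever side it is on; everything else is left to the hardware. */
static double add_na_first(double x, double y)
{
    if (looks_like_na(x)) return x;
    if (looks_like_na(y)) return y;
    return x + y;
}
```

Both add_na_first(NA, NaN) and add_na_first(NaN, NA) then satisfy the 
NA test, but every element of every vector operation pays for the two 
checks, which is the cost Bill and Martin are worried about.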

Duncan Murdoch

> 
>     > adding NA and NaN:
>     >> NA_real_ + NaN
>     > [1] NaN
>     >> NaN + NA_real_
>     > [1] NA
>     > If I compile src/main/arithmetic.c without optimization (just -g)
>     > then both of those return NA.
> 
>     > On Windows, using a precompiled R 2.8.1 from CRAN I get
>     > NA for both answers.
> 
>     > On Linux, after compiling src/main/arithmetic.c with -g -O2 the bit
>     > patterns for NA_real_ and as.numeric(NA) are different:
>     >> my_numeric_NA <- as.numeric(NA)
>     >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>     > 0000000 07a2 0000 0000 7ff8
>     > 0000010
>     >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>     > 0000000 07a2 0000 0000 7ff0
>     > 0000010 
>     > On Linux, after compiling with -g the bit patterns for NA_real_
>     > and as.numeric(NA) are identical.
>     >> my_numeric_NA <- as.numeric(NA)
>     >> writeBin(my_numeric_NA, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>     > 0000000 07a2 0000 0000 7ff8
>     > 0000010
>     >> writeBin(NA_real_, ptmp<-pipe("od -x", open="wb"));close(ptmp)
>     > 0000000 07a2 0000 0000 7ff8
>     > 0000010
> 
>     > On Windows, using precompiled R 2.8.1 and cygwin/bin/od, both of those
>     > gave the 7ff8 version.
> 
>     > Is this confounding of NA and NaN of concern or does R not promise to
>     > keep NA and NaN distinct? 
> 
> Hmm, I'd say it *is* of some concern that "+" is not commutative
> in the narrow sense, even if I don't know what exactly "R promises".
> 
>     > I haven't followed all the macros, but it looks like arithmetic.c just
>     > does
>     >     result[i]=x[i]+y[i]
>     > and lets the compiler/floating point unit decide what to do when x[i]
>     > and y[i] are different NaN values (NA is a NaN value).  I haven't looked
>     > at the C code for the initialization of NA_real_.  Adding explicit tests
>     > for NA-ness in the binary operators (as S+ does) adds a fairly
>     > significant cost.
> 
> Yes, I would be quite reluctant to add such
> tests, because such costs are to be expected.
> 
> Maybe we ("R" :-) should explicitly state that operations mixing
> NA & NaN give a result which is NA in the sense of fulfilling is.na(.) 
> but *not* promise anything further.
> 
> Martin Maechler, ETH Zurich
> 
>     > Bill Dunlap
>     > TIBCO Software Inc - Spotfire Division
>     > wdunlap tibco.com