# [Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds

Henrik Bengtsson henrik.bengtsson at gmail.com
Sat Apr 1 07:51:24 CEST 2017

```On Fri, Mar 31, 2017 at 10:14 PM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> From ?NA
>
>      Numerical computations using ‘NA’ will normally result in ‘NA’: a
>      possible exception is where ‘NaN’ is also involved, in which case
>      either might result.
>
> and ?NaN
>
>      Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
>      which of those two is not guaranteed and may depend on the R
>      platform (since compilers may re-order computations).
>
> fortunes::fortune(14) applies (yet again).

Thanks; I'm often happy to have contributed to some of the fortune
one of my own matrixStats NEWS has an entry go a few years back which
mentions "... incorrectly assumed that the value of prod(c(NaN, NA))
is uniquely defined.  However, as documented in help("is.nan"), it may
be NA or NaN depending on R system/platform."  I guess the joke is on
me - it's April 1st after all.

But, technically one could test for ISNA(x) for each element before
calculating the intermediate sum, but since that is a quite expensive
test it is not done and sum += x is performed "as is" on NA and NaN
(and -Inf and +Inf).  Is that correct?

/Henrik

>
>
> On 01/04/2017 04:50, Henrik Bengtsson wrote:
>>
>> In R 3.3.3, I observe the following on Ubuntu 16.04 (when building
>> from source as well as for the sudo apt r-base build):
>>
>>> x <- c(NA, NaN)
>>> mean(x)
>>
>> [1] NA
>>>
>>> mean(rev(x))
>>
>> [1] NaN
>>
>>> rowMeans(matrix(x, nrow = 1, ncol = 2))
>>
>> [1] NA
>>>
>>> rowMeans(matrix(rev(x), nrow = 1, ncol = 2))
>>
>> [1] NaN
>>
>>> .rowMeans(x, m = 1, n = 2)
>>
>> [1] NA
>>>
>>> .rowMeans(rev(x), m = 1, n = 2)
>>
>> [1] NaN
>>
>>> .rowSums(x, m = 1, n = 2)
>>
>> [1] NA
>>>
>>> .rowSums(rev(x), m = 1, n = 2)
>>
>> [1] NaN
>>
>>> rowSums(matrix(x, nrow = 1, ncol = 2))
>>
>> [1] NA
>>>
>>> rowSums(matrix(rev(x), nrow = 1, ncol = 2))
>>
>> [1] NaN
>>
>> I'd expect NA to trump NaN in all cases (with na.rm = FALSE).  sum()
>> does not have this problem and returns NA in both cases (*).
>>
>> For the same R version build from source on RHEL 6.6 system
>> (completely different architecture), I get the expected result (= NA)
>> for all of the above cases, e.g.
>>
>>> x <- c(NA, NaN)
>>> mean(x)
>>
>> [1] NA
>>>
>>> mean(rev(x))
>>
>> [1] NA
>> [...]
>>
>> Before going insane trying to troubleshoot this, I have a vague memory
>> that this, or something related to this, has been discussed
>> previously, but I cannot locate it.
>>
>> Is the above a bug in R, a FAQ, a build error, overzealous compiler
>> optimization, and / or ...?
>>
>> Thanks,
>>
>> Henrik
>
>
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

```