[Rd] double in summary.c : isum

Matthew Dowle mdowle at mdowle.plus.com
Mon Mar 25 12:27:38 CET 2013

On 25.03.2013 09:20, Prof Brian Ripley wrote:
> On 24/03/2013 15:01, Duncan Murdoch wrote:
>> On 13-03-23 10:20 AM, Matthew Dowle wrote:
>>> On 23.03.2013 12:01, Prof Brian Ripley wrote:
>>>> On 20/03/2013 12:56, Matthew Dowle wrote:
>>>>> Hi,
>>>>> Please consider the following :
>>>>>> x = as.integer(2^30-1)
>>>>> [1] 1073741823
>>>>>> sum(c(rep(x, 10000000), rep(-x,9999999)))
>>>>> [1] 1073741824
>>>>> Tested on 2.15.2 and a recent R-devel (r62132).
>>>>> I'm wondering if s in isum could be LDOUBLE instead of double, 
>>>>> like
>>>>> rsum, to fix this edge case?
>>>> No, because there is no guarantee that LDOUBLE differs from double
>>>> (and platform on which it does not).
>>> That's a reason for not using LDOUBLE at all isn't it? Yet 
>>> src/main/*.c
>>> has 19 lines using LDOUBLE e.g. arithmetic.c and cum.c as well as
>>> summary.c.
>>> I'd assumed LDOUBLE was being used by R to benefit from long double 
>>> (or
>>> equivalent) on platforms that support it (which is all modern Unix, 
>>> Mac
>>> and Windows as far as I know). I do realise that the edge case 
>>> wouldn't
> Actually, you don't know.  Really only on almost all Intel ix86: most
> other current CPUs do not have it in hardware.  C99/C11 require long
> double, but does not require the accuracy that you are thinking of 
> and
> it can be implemented in software.

This is very interesting, thanks. Which of the CRAN machines don't 
support LDOUBLE with higher accuracy than double, either in hardware or 
software?  Yes I had assumed that all CRAN machines would do. It would 
be useful to know for something else I'm working on as well.

>>> be fixed on platforms where LDOUBLE is defined as double.
>> I think the problem is that there are two opposing targets in R:  we
>> want things to be as accurate as possible, and we want them to be
>> consistent across platforms. Sometimes one goal wins, sometimes the
>> other.  Inconsistencies across platforms give false positives in 
>> tests
>> that tend to make us miss true bugs.  Some people think we should 
>> never
>> use LDOUBLE because of that.  In other cases, the extra accuracy is 
>> so
>> helpful that it's worth it.  So I think you'd need to argue that the
>> case you found is something where the benefit outweighs the costs. 
>> Since
>> almost all integer sums are done exactly with the current code, is 
>> it
>> really worth introducing inconsistencies in the rare inexact cases?
> But as I said lower down, a 64-bit integer accumulator would be
> helpful, C99/C11 requires one at least that large and it is
> implemented in hardware on all known R platforms.  So there is a way
> to do this pretty consistently across platforms.

That sounds much better. Is it just a matter of changing s to be 
declared as uint64_t?

>> Duncan Murdoch
>>> What have I misunderstood?
>>>> Users really need to take responsibility for the numerical 
>>>> stability
>>>> of calcuations they attempt.  Expecting to sum 20 million large
>>>> numbers exactly is unrealistic.
>>> Trying to take responsibility, but you said no. Changing from 
>>> double to
>>> LDOUBLE would mean that something that wasn't realistic, was then
>>> realistic (on platforms that support long double).
>>> And it would bring open source R into line with TERR, which gets 
>>> the
>>> answer right, on 64bit Windows at least. But I'm not sure I should 
>>> be as
>>> confident in TERR as I am in open source R because I can't see its
>>> source code.
>>>> There are cases where 64-bit integer accumulators would be
>>>> beneficial, and this is one.  Unfortunately C11 does not require 
>>>> them
>>>> but some optional moves in that direction are planned.
>>>>> https://svn.r-project.org/R/trunk/src/main/summary.c
>>>>> Thanks,
>>>>> Matthew
>>>>> ______________________________________________
>>>>> R-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel

More information about the R-devel mailing list