[Rd] double in summary.c : isum

Sun Mar 24 16:01:58 CET 2013

On 13-03-23 10:20 AM, Matthew Dowle wrote:
> On 23.03.2013 12:01, Prof Brian Ripley wrote:
>> On 20/03/2013 12:56, Matthew Dowle wrote:
>>>
>>> Hi,
>>>
>>> Please consider the following :
>>>
>>>> x = as.integer(2^30-1)
>>> [1] 1073741823
>>>> sum(c(rep(x, 10000000), rep(-x,9999999)))
>>> [1] 1073741824
>>>
>>> Tested on 2.15.2 and a recent R-devel (r62132).
>>>
>>> I'm wondering if s in isum could be LDOUBLE instead of double, like
>>> rsum, to fix this edge case?
>>
>> No, because there is no guarantee that LDOUBLE differs from double
>> (and platform on which it does not).
>
> That's a reason for not using LDOUBLE at all isn't it? Yet src/main/*.c
> has 19 lines using LDOUBLE e.g. arithmetic.c and cum.c as well as
> summary.c.
>
> I'd assumed LDOUBLE was being used by R to benefit from long double (or
> equivalent) on platforms that support it (which is all modern Unix, Mac
> and Windows as far as I know). I do realise that the edge case wouldn't
> be fixed on platforms where LDOUBLE is defined as double.

I think the problem is that there are two opposing targets in R:  we 
want things to be as accurate as possible, and we want them to be 
consistent across platforms. Sometimes one goal wins, sometimes the 
other.  Inconsistencies across platforms give false positives in tests 
that tend to make us miss true bugs.  Some people think we should never 
use LDOUBLE because of that.  In other cases, the extra accuracy is so 
helpful that it's worth it.  So I think you'd need to argue that the 
case you found is something where the benefit outweighs the costs. 
Since almost all integer sums are done exactly with the current code, is 
it really worth introducing inconsistencies in the rare inexact cases?

Duncan Murdoch

>
> What have I misunderstood?
>
>>
>> Users really need to take responsibility for the numerical stability
>> of calcuations they attempt.  Expecting to sum 20 million large
>> numbers exactly is unrealistic.
>
> Trying to take responsibility, but you said no. Changing from double to
> LDOUBLE would mean that something that wasn't realistic, was then
> realistic (on platforms that support long double).
>
> And it would bring open source R into line with TERR, which gets the
> answer right, on 64bit Windows at least. But I'm not sure I should be as
> confident in TERR as I am in open source R because I can't see its
> source code.
>
>
>> There are cases where 64-bit integer accumulators would be
>> beneficial, and this is one.  Unfortunately C11 does not require them
>> but some optional moves in that direction are planned.
>>
>>>
>>> https://svn.r-project.org/R/trunk/src/main/summary.c
>>>
>>> Thanks,
>>> Matthew
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>