[R] Bug or feature? sum(c(a, b, c)) != (a + b + c)

(Ted Harding) ted.harding at wlandres.net
Wed Aug 24 01:06:38 CEST 2011


On 23-Aug-11 22:20:48, Barry Rowlingson wrote:
> On Tue, Aug 23, 2011 at 8:17 PM, Daniel Lai <danlai at bccrc.ca> wrote:
>> Greetings all,
>>
>> I'm porting an algorithm from MATLAB to R, and noticed some minor
>> discrepancies in small decimal values using rowSums and colSums which
>> are
>> exacerbated after heavy iteration and log space transformation. This
>> was
>> rather perplexing as both programs claimed and appeared to use the
>> IEEE 754
>> standard for floating point arithmetic (confirmed with manual basic
>> operations). _After some tracing and testing, I've managed to isolated
>> a
>> minimal working example as follows:
>>
>> a = 0.812672
>> b = 0.916541
>> c = 0.797810
>> sum(c(a, b, c)) == (a + b + c)
>> [1] FALSE
> 
>  Its probably to do with the order of summations. With your a,b,c you
> get:
> 
>  > (a+b+c) == (c+b+a)
>  [1] TRUE
>  > (a+b+c) == (c+a+b)
>  [1] FALSE
> 
> shock horror, addition is not associative[1]. Lets investigate:
> 
>  > sum(c(a,b,c)) == c+a+b
>  [1] TRUE
>  > sum(c(a,b,c)) == a+c+b
>  [1] TRUE
> 
>  'sum' seems to get the same answer as adding the first and the third,
> then adding the second - explicitly:
> 
>  > sum(c(a,b,c)) == (a+c)+b
>  [1] TRUE
> 
> I'm not sure what it would do for four values in the sum. Have fun
> finding out. Does matlab similarly have a+b+c != c+b+a?
> 
> Barry
> 
> [1] or commutative or distributive or one of those -ives you learn one
> day in school. Too lazy to wikipedia it right now...

   'sum' seems to get the same answer as adding the first and the third,
   then adding the second

It seems to be a bit more subtle than that!

  sum(c(a,b,c))==sum(c(a,b,c))
  # [1] TRUE
  sum(c(a,b,c))==sum(c(a,c,b))
  # [1] TRUE
  sum(c(a,b,c))==sum(c(b,a,c))
  # [1] TRUE
  sum(c(a,b,c))==sum(c(b,c,a))
  # [1] TRUE
  sum(c(a,b,c))==sum(c(c,a,b))
  # [1] TRUE
  sum(c(a,b,c))==sum(c(c,b,a))
  # [1] TRUE

so that's TRUE for all 6 permutations. However (as before,
but all 6 this time):

  a+b+c == a+b+c
  # [1] TRUE
  a+b+c == a+c+b
  # [1] FALSE
  a+b+c == b+c+a
  # [1] TRUE
  a+b+c == b+a+c
  # [1] TRUE
  a+b+c == c+a+b
  # [1] FALSE
  a+b+c == c+b+a
  # [1] TRUE

So it looks as though sum(c(...)) is somwhow independent of
the order of its arguments, which implies that the summation
it does is not quite the addition corresponding to "+".

I now feel out of my depth, so hope someone else knows/can
find the truth about this!

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 24-Aug-11                                       Time: 00:06:35
------------------------------ XFMail ------------------------------



More information about the R-help mailing list