[R] Difference

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Tue Apr 19 11:43:08 CEST 2005


On 19-Apr-05 Ralf Strobl wrote:
> Dear List,
> can anyone explain me this result (Windows XP, R 2.0.1):
> 
>  > (0.2-0.1)==0.1
> [1] TRUE
>  > (0.3-0.2)==0.1
> [1] FALSE
> 
> Regards,
> Ralf Strobl

It is a consequence of the finite length of the binary
expansion of decimal fractions, which is not exact unless
the fraction is a multiple of 1/2, 1/4, 1/8, 1/16 ... (just
as 1/3, 1/7 etc. are not exact in a decimal representation).

Example (this is not how floating-point is really done, which
is more complicated, and R uses a different number of binary
places from the 34 used here, but it illustrates the above point):

1/10
.0001100110011001100110011001100110  [to 34 binary places]

2/10
.0011001100110011001100110011001100

3/10
.0100110011001100110011001100110011

3/10 - 2/10:
 .0100110011001100110011001100110011
-.0011001100110011001100110011001100
 -----------------------------------
=.0001100110011001100110011001100111

2/10 - 1/10
 .0011001100110011001100110011001100
-.0001100110011001100110011001100110
 -----------------------------------
=.0001100110011001100110011001100110

Not equal!

The difference is 1 in the last place, i.e. 2^(-34) in this
example. (2/10 - 1/10) is "right" (to the number of binary
places calculated), while (3/10 - 2/10) is "wrong".
(You might say that there's an inaccuracy in 2/10, since the
truncation has chopped off subsequent digits 110011...
which ought to round up the last digit of 2/10 to 1: but then
you'll see that in the subtractions this changes the last
digit of each result, so they're still different! This time,
(3/10 - 2/10) would be "right", and (2/10 - 1/10) "wrong".
In fact they're both wrong, all the time.)
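
The hand calculation above can be reproduced in R as a rough
sketch. The helper names (binary_places, in_units, show_bits) are
made up for this illustration and are not part of R; like the
example above, this truncates to 34 places and is not how R stores
numbers internally.

```r
# Truncated binary expansion of num/den (with 0 <= num/den < 1),
# obtained by long division in base 2. Digits beyond `places` are
# chopped off, not rounded, as in the hand calculation.
binary_places <- function(num, den, places = 34) {
  digits <- numeric(places)
  r <- num
  for (i in seq_len(places)) {
    r <- 2 * r
    digits[i] <- r %/% den   # next binary digit (0 or 1)
    r <- r %% den            # remainder carried to the next place
  }
  digits
}

# Value of a digit vector in units of the last place, 2^(-places).
# All quantities stay far below 2^53, so this arithmetic is exact.
in_units <- function(digits) {
  sum(digits * 2^(rev(seq_along(digits)) - 1))
}

# Digits as a string with a leading binary point.
show_bits <- function(digits) paste0(".", paste(digits, collapse = ""))

d1 <- binary_places(1, 10)
d2 <- binary_places(2, 10)
d3 <- binary_places(3, 10)

show_bits(d1)
show_bits(d2)
show_bits(d3)

# Discrepancy between the two differences, in units of 2^(-34)
(in_units(d3) - in_units(d2)) - (in_units(d2) - in_units(d1))
##[1] 1
```

Changing "places" shows that the discrepancy comes and goes with
the number of digits kept, which is why neither difference can be
called the "right" one.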

Now have a look at your cases in R:

  (0.3 - 0.2) - (0.2 - 0.1)
  ##[1] -2.775558e-17

  2^(-55)
  ##[1] 2.775558e-17

which is the same phenomenon (though to a different number
of binary places).
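
To look at this more closely, one possible check (a sketch only,
assuming the usual IEEE double precision that R uses on common
platforms) is to print the stored values to 17 significant digits
and to compare the discrepancy with the power of two directly:

```r
a <- 0.3 - 0.2
b <- 0.2 - 0.1

# 17 significant digits are enough to tell any two different
# doubles apart, so this shows what is really stored.
sprintf("%.17g", c(a, b, 0.1))

# The discrepancy is exactly a power of two, not merely close to one.
gap <- a - b
gap == -2^(-55)
##[1] TRUE

# For comparison: the spacing of doubles just above 1.
.Machine$double.eps == 2^(-52)
##[1] TRUE
```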

You can't really escape from this entirely, though in R you
can cover up such inconsistencies by using "all.equal":

  all.equal(base)    Test if Two Objects are (Nearly) Equal

Thus:

  all.equal((0.3 - 0.2),(0.2 - 0.1))
  ##[1] TRUE

which is fine if that's what you want to do. But then it
may hide something that you might need to know:

  x<-1 ; y<-(x - 2^(-53))
  all.equal(x,y)
  ##[1] TRUE
  x==y
  ##[1] FALSE

Here you've deliberately made y different from x, but
"all.equal" hides the truth, while "==" shows it. These
tiny effects at the bottom end of the binary digits usually
do not matter and rarely show up in "ordinary" calculations,
but on occasion they can lead to very delicate questions
of precision. For instance, if you calculate two values
x and y which might turn out to be slightly different (in
the above way), yet for later logical decisions you don't
want this to affect things, then you could do something
like

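  if(isTRUE(all.equal(x,y))){y <- x}

(the "isTRUE" is needed because "all.equal" returns a description
of the difference, not FALSE, when the values are not nearly
equal). This will make them identical, and for instance cause
a later test "if(x==y)" to succeed. But then should you have
done {x <- y} instead? You have to think about whether or not
this will matter!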
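
A minimal sketch of that idea, wrapped in a function. The name
nearly_equal is hypothetical, and the tolerance shown is simply
all.equal's documented default for numbers; whether that tolerance
suits a given problem is something you have to decide for yourself.

```r
# all.equal() gives TRUE when the values are nearly equal, but a
# character string (not FALSE) when they are not:
res <- all.equal(1, 1.1)
is.character(res)
##[1] TRUE

# Hypothetical wrapper that always returns a single TRUE or FALSE,
# so it is safe to use inside if().
nearly_equal <- function(a, b, tol = sqrt(.Machine$double.eps)) {
  isTRUE(all.equal(a, b, tolerance = tol))
}

x <- 0.3 - 0.2
y <- 0.2 - 0.1

x == y
##[1] FALSE

# Overwrite y with x when the two are nearly equal
# (or x with y: that choice is yours).
if (nearly_equal(x, y)) y <- x

x == y
##[1] TRUE

nearly_equal(1, 1.1)
##[1] FALSE
```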

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 19-Apr-05                                       Time: 10:43:08
------------------------------ XFMail ------------------------------



