[R] Odp: precision issue?

(Ted Harding) Ted.Harding at manchester.ac.uk
Thu Mar 4 13:35:11 CET 2010


On 04-Mar-10 10:50:56, Petr PIKAL wrote:
> Hi
> 
> r-help-bounces at r-project.org wrote on 04.03.2010 10:36:43:
>> Hi R Gurus,
>> 
>> I am trying to figure out what is going on here.
>> 
>> > a <- 68.08
>> > b <- a-1.55
>> > a-b
>> [1] 1.55
>> > a-b == 1.55
>> [1] FALSE
>> > round(a-b,2) == 1.55
>> [1] TRUE
>> > round(a-b,15) == 1.55
>> [1] FALSE
>> 
>> Why should (a - b) == 1.55 fail when in fact b has been defined
>> to be a - 1.55?  Is this a precision issue? How do i correct this?
> 
> In real world those definitions of b are the same but not in computer 
> world. See FAQ 7.31
> 
> Use either rounding or all.equal.
> 
>> all.equal(a-b, 1.55)
> [1] TRUE
> 
> To all, this is quite a common question and it is documented in the FAQs.
> [...] programs, therefore maybe a confusion from novices.
> 
> I wonder if there could be some type of global option which will
> get rid of these users mistakes or misunderstandings by setting
> some threshold option for equality testing by use "==".
> 
> Regards
> Petr
> 
>> Alex

Interesting suggestion, but in my view it would probably give
rise to more problems than it would avoid!

The fundamental issue is that many inexperienced users are not
aware that once 68.08 has got inside the computer (as used by
R and other programs which do fixed-length binary arithmetic)
it is no longer exactly 68.08 (and 1.55 is no longer exactly 1.55
either, since neither value has a finite binary expansion).
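
To see what is actually stored, one can print more digits than R
shows by default. A minimal sketch (the names x and shown are only
for illustration; sprintf() with "%.20f" asks for 20 decimal places,
and digits beyond about the 15th significant one can be
platform-dependent):

```r
x <- c(68.08, 1.55, 68.08 - 1.55)
print(x)                      # default display, 7 significant digits
shown <- sprintf("%.20f", x)  # the same numbers, 20 decimal places
print(shown)
```

On most platforms the longer strings no longer match the decimal
literals that were typed in.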

Since "==" tests for equality of stored binary representations,
it is inevitable that it will often return FALSE when the user
would "logically" expect TRUE (as in Alex's query above). When
a naive user encounters this, and is led to raise a query on
the list, a successful reply will have the effect that yet one
more user has learned something. This is useful.

As the help page ?Comparison (also accessible from ?"==" etc.)
states:

  Do not use '==' and '!=' for tests, such as in 'if' expressions,
  where you must get a single 'TRUE' or 'FALSE'.  Unless you are
  absolutely sure that nothing unusual can happen, you should use
  the 'identical' function instead.

  For numerical and complex values, remember '==' and '!=' do not
  allow for the finite representation of fractions, nor for rounding
  error.  Using 'all.equal' with 'identical' is almost always
  preferable.  See the examples.
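
A small sketch of why that first paragraph warns against "==" inside
'if' (the vectors x and y are made up for illustration): "==" works
elementwise and propagates NA, whereas identical() always returns a
single TRUE or FALSE.

```r
x <- c(1, 2)
y <- c(1, 3)

elementwise <- (x == y)       # one result per element, not a single value
single      <- identical(x, y)  # always exactly one TRUE or FALSE
with_na     <- (NA == 1)      # NA, which 'if' cannot use as a condition

print(elementwise)
print(single)
print(with_na)
```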

It can on occasion be useful to be able to test for exact equality
of internal binary representations. Also, what is really going on
when "==" appears to fail can be ascertained by evaluating the
within-computer difference between allegedly equivalent expressions.
Thus, for instance:

  a <- 68.08
  b <- a-1.55
  a-b == 1.55
  # [1] FALSE
  (a-b) - 1.55
  # [1] -2.88658e-15

  all.equal((a-b), 1.55)
  # [1] TRUE
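
For scale, the discrepancy can be compared with the machine epsilon
and with the default tolerance of all.equal() for numeric arguments,
which is sqrt(.Machine$double.eps), about 1.5e-8. A sketch, with d
and tol as illustrative names:

```r
a <- 68.08
b <- a - 1.55
d <- (a - b) - 1.55               # the within-computer difference
tol <- sqrt(.Machine$double.eps)  # default tolerance of all.equal()

print(d / .Machine$double.eps)    # the difference in units of machine epsilon
print(abs(d) < tol)               # nonzero, but far inside the tolerance
```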

I think that introducing a default "tolerance" option to "==",
thus making it masquerade as all.equal(), would both suppress
the capability of "==" to test exact equality of internal
representations, and contribute to persistence of misconceptions
by naive users. At least the current situation contributes to
removing the latter.

The result of the former could well be that a complicated
program goes astray because a deep-lying use of "==" gives
the "equal within tolerance" result rather than the "not
exactly equal" result, leading to some really difficult
queries to the list!

Of course, the use of "all.equal((a-b), 1.55)" is a lot longer
than the use of "(a-b) == 1.55", and there is an understandable
argument for something as snappy as "==" for testing approximate
equality. But I think that subverting "==" is the wrong way to
go about it. Leave "==" alone and, according to taste, introduce
a new operator (which the user can define for himself anyway),
say "%==%":

  "%==%" <- function(x,y){ all.equal(x,y) }
  (a-b) %==% 1.55
  # [1] TRUE

Or perhaps

  "%==%" <- function(x,y){ identical(all.equal(x,y),TRUE) }

as in the Example in ?Comparison. Then you have a snappy shorthand,
and it operates exactly as all.equal() does and with the same
tolerance, and it won't break anything else as a result of
subverting "==".
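
One practical difference between the two definitions: when its
arguments differ, all.equal() returns a character string describing
the difference rather than FALSE, so only the second form is safe
inside 'if'. A sketch, using the hypothetical names %ae% and %eq%
so that both versions can sit side by side:

```r
"%ae%" <- function(x, y) { all.equal(x, y) }                   # passes through all.equal()'s result
"%eq%" <- function(x, y) { identical(all.equal(x, y), TRUE) }  # always a single TRUE or FALSE

a <- 68.08
b <- a - 1.55

r1 <- (a - b) %ae% 1.55   # equal within tolerance
r2 <- (a - b) %eq% 1.55   # equal within tolerance
r3 <- 1 %ae% 2            # a character string, not FALSE
r4 <- 1 %eq% 2            # FALSE
print(list(r1, r2, r3, r4))
```

Note that the parentheses around (a-b) are needed: user-defined
%...% operators bind more tightly than binary minus, so without them
the expression would be read as a - (b %eq% 1.55).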

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 04-Mar-10                                       Time: 12:35:08
------------------------------ XFMail ------------------------------


