[R] The effect of tolerance in all.equal()

Tue Apr 25 11:44:06 CEST 2017

>>>>> Ashim Kapoor <ashimkapoor at gmail.com>
>>>>>     on Tue, 25 Apr 2017 14:02:18 +0530 writes:

    > Dear all,
    > I am not able to understand the interplay of absolute vs relative and
    > tolerance in the use of all.equal

    > If I want to find out if absolute differences between 2 numbers/vectors are
    > bigger than a given tolerance I would do:

    > all.equal(1,1.1,scale=1,tol= .1)

    > If I want to find out if relative differences between 2 numbers/vectors are
    > bigger than a given tolerance I would do :

    > all.equal(1,1.1,tol=.1)

    > ##################################################################################################################################

    > I can also do :

    > all.equal(1,3,tol=1)

    > to find out if the absolute difference is bigger than 1. But here I won't be
    > able to detect absolute differences smaller than 1 in this case, so I don't
    > think that this is a good way.

    > My query is: what is the reasoning behind all.equal returning the absolute
    > difference if the tolerance >= target and relative difference if tolerance
    > < target?
(Above, the comparison is really  tol >= |target|  vs.  tol < |target|,
 i.e., it uses the absolute value of target.)
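
The rule in question can be observed directly. A minimal sketch, relying only
on base R's all.equal() with no `scale` given (the variable names are merely
illustrative): a relative difference is reported when tolerance < |target|,
an absolute one otherwise, and a negative target shows that it is |target|
that matters.

```r
# tolerance < |target|  -> the relative difference is reported
r_rel <- all.equal(1, 3, tolerance = 0.5)

# tolerance >= |target| -> the absolute difference is reported
r_abs <- all.equal(1, 3, tolerance = 1)

# Negative target: |target| = 1 > 0.5, so this is still "relative"
r_neg <- all.equal(-1, -3, tolerance = 0.5)

print(r_rel)   # "Mean relative difference: 2"
print(r_abs)   # "Mean absolute difference: 2"
print(r_neg)   # "Mean relative difference: 2"
```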

The following are desiderata / restrictions :

1) Relative tolerance is needed to keep things scale-invariant
   i.e.,  all.equal(x, y)  and  all.equal(1000 * x, 1000 * y)
   should typically be identical for (almost) all (x,y).

   ==> "the typical behavior should use relative error tolerance"

2) when x or y (and typically both!) are very close to zero it
   is typically undesirable to keep relative tolerances (in the
   boundary case, they _are_ zero exactly, and "relative error" is undefined).
   E.g., for most purposes, 3.45e-15 and 1.23e-17 should be counted as
   equal to zero and hence to themselves.
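
Desideratum 1) can be checked with a small experiment. This is only a sketch
with made-up data (x, y and the tolerance 0.01 are arbitrary choices): the
default comparison gives the same kind of verdict after rescaling, whereas
forcing an absolute comparison via scale = 1 does not.

```r
x <- c(1, 2, 3)
y <- x * (1 + 1e-3)          # every element is off by 0.1 %

# Default (relative) comparison: same verdict before and after rescaling
rel_small <- all.equal(x, y)
rel_big   <- all.equal(1000 * x, 1000 * y)

# Absolute comparison (scale = 1) is *not* scale-invariant:
# the mean absolute difference grows by the factor 1000
abs_small <- all.equal(x, y, scale = 1, tolerance = 0.01)
abs_big   <- all.equal(1000 * x, 1000 * y, scale = 1, tolerance = 0.01)

print(rel_small); print(rel_big)
print(abs_small); print(abs_big)
```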

1) and 2) are typically reconciled by switching from relative to absolute
when the arguments are close to zero (*).
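
A short illustration of that switch near zero, using the default tolerance
(about 1.5e-8). The numbers are the ones from 2) plus an exact zero; the
variable names are illustrative only.

```r
# Exact zero as target: no relative error can be formed,
# so all.equal() compares the absolute difference with the tolerance.
z_close <- all.equal(0, 1e-10)   # 1e-10 is below the default tolerance
z_far   <- all.equal(0, 1e-3)    # 1e-3 is not

# Two tiny numbers differing by a factor of about 280:
# |target| is below the tolerance, so the absolute difference is used
tiny <- all.equal(3.45e-15, 1.23e-17)

print(z_close)   # TRUE
print(z_far)     # reports a mean absolute difference
print(tiny)      # TRUE
```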

The exact cutoff at which to switch from relative to absolute
(or a combination of the two) is somewhat arbitrary(*2), and for
all.equal() the choice was made in the 1980s (or even slightly
earlier?) when all.equal() was introduced into the S language at
Bell Labs, AFAIK. Maybe John Chambers (or Rick Becker or ...,
but they may not read R-help) knows more.
*2) Then again, the choice for all.equal() is in some way "least arbitrary",
    using c = 1 in the more general   tolerance >= c*|target|  framework.

*) There have been alternatives in "the (applied numerical
 analysis / algorithm) literature" seen in published algorithms,
 but I don't have any example ready.
 Notably some of these alternatives are _symmetric_ in (x,y)
 where all.equal() was designed to be asymmetric using names
 'target' and 'current'.

The alternative idea runs along the following lines:

Assume that for "equality" we want _both_ relative and
absolute (e := tolerance) "equality"

   |x - y| < e (|x|+|y|)/2   (where you could use |y| or |x|
                              instead of their mean; all.equal()
                              uses |target|)
   |x - y| < e * e1          (where e1 = 1, or e1 = 10^-7..)

If you add the two inequalities (and absorb the resulting factor 2
into e) you get

   |x - y| < e (e1 + (|x|+|y|)/2)

as a check which is a "mixture" of relative and absolute tolerance.

With a somewhat long history, my gut feeling would nowadays
actually prefer this (I think with a default of e1 = e) - which
does treat x and y symmetrically.
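
A minimal sketch of such a symmetric check, using the mean of |x| and |y| as
in the first inequality. Note that near_equal() is a hypothetical helper
written for this illustration and is not part of R; it also checks
elementwise, whereas all.equal() works with mean differences.

```r
# Hypothetical helper, not part of base R: symmetric mixed tolerance check
#   |x - y| < e * (e1 + (|x| + |y|) / 2),  applied elementwise
near_equal <- function(x, y, e = sqrt(.Machine$double.eps), e1 = e) {
  all(abs(x - y) < e * (e1 + (abs(x) + abs(y)) / 2))
}

a <- 3.45e-15; b <- 1.23e-17

# Swapping the arguments cannot change the answer
sym_ok <- identical(near_equal(a, b, e = 1e-15, e1 = 1),
                    near_equal(b, a, e = 1e-15, e1 = 1))

# e1 sets the absolute floor: with e1 = 1 the floor is e itself,
# with the default e1 = e it is e^2 (about 2.2e-16 here)
floor_1 <- near_equal(a, b, e1 = 1)   # |a - b| ~ 3.4e-15 is below 1.5e-8
floor_e <- near_equal(a, b)           # |a - b| ~ 3.4e-15 is above 2.2e-16

print(c(sym_ok = sym_ok, floor_1 = floor_1, floor_e = floor_e))
```

So with e1 = e, the two tiny numbers from 2) above are no longer treated as
equal; whether that is desirable depends on the application.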

Note that convergence checks in good algorithms typically check
for _both_ relative and absolute difference (each with its
tolerance providable by the user), and the really good ones for
minimization do  check for (approximate) gradients also being
close to zero - as old timers among us should have learned from
Doug Bates ... but now I'm really digressing.
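
As a rough illustration of such a convergence check (a sketch only: the
helper names, the default tolerances, and the choice to stop as soon as
*either* criterion is met are assumptions made here, not taken from any
particular algorithm):

```r
# Hypothetical helper: the step counts as converged when it is small in
# absolute terms OR small relative to the previous iterate.
converged <- function(new, old, reltol = 1e-10, abstol = 1e-14) {
  d <- abs(new - old)
  d <= abstol || d <= reltol * abs(old)
}

# Newton iteration for sqrt(a), used only as a small test problem
newton_sqrt <- function(a, x = 1, maxit = 100) {
  for (i in seq_len(maxit)) {
    x_new <- (x + a / x) / 2
    if (converged(x_new, x)) return(x_new)
    x <- x_new
  }
  warning("no convergence within maxit iterations")
  x
}

root <- newton_sqrt(2)
print(root)
```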

Last but not least, some R code at the end, showing that the *asymmetric*
nature of all.equal() may lead to somewhat astonishing (but very
logical and as documented!) behavior.

Martin

    > Best Regards,
    > Ashim

> ## The "data" to use:
> epsQ <- lapply(seq(12,18,by=1/2), function(P) bquote(10^-.(P))); names(epsQ) <- sapply(epsQ, deparse); str(epsQ)
List of 13
 $ 10^-12  : language 10^-12
 $ 10^-12.5: language 10^-12.5
 $ 10^-13  : language 10^-13
 $ 10^-13.5: language 10^-13.5
 $ 10^-14  : language 10^-14
 $ 10^-14.5: language 10^-14.5
 $ 10^-15  : language 10^-15
 $ 10^-15.5: language 10^-15.5
 $ 10^-16  : language 10^-16
 $ 10^-16.5: language 10^-16.5
 $ 10^-17  : language 10^-17
 $ 10^-17.5: language 10^-17.5
 $ 10^-18  : language 10^-18

> str(lapply(epsQ, function(tl) all.equal(3.45e-15, 1.23e-17, tol = eval(tl))))
List of 13
 $ 10^-12  : logi TRUE
 $ 10^-12.5: logi TRUE
 $ 10^-13  : logi TRUE
 $ 10^-13.5: logi TRUE
 $ 10^-14  : logi TRUE
 $ 10^-14.5: chr "Mean relative difference: 0.9964348"
 $ 10^-15  : chr "Mean relative difference: 0.9964348"
 $ 10^-15.5: chr "Mean relative difference: 0.9964348"
 $ 10^-16  : chr "Mean relative difference: 0.9964348"
 $ 10^-16.5: chr "Mean relative difference: 0.9964348"
 $ 10^-17  : chr "Mean relative difference: 0.9964348"
 $ 10^-17.5: chr "Mean relative difference: 0.9964348"
 $ 10^-18  : chr "Mean relative difference: 0.9964348"

> ## Now swap `target` and `current` :
> str(lapply(epsQ, function(tl) all.equal(1.23e-17, 3.45e-15, tol = eval(tl))))
List of 13
 $ 10^-12  : logi TRUE
 $ 10^-12.5: logi TRUE
 $ 10^-13  : logi TRUE
 $ 10^-13.5: logi TRUE
 $ 10^-14  : logi TRUE
 $ 10^-14.5: chr "Mean absolute difference: 3.4377e-15"
 $ 10^-15  : chr "Mean absolute difference: 3.4377e-15"
 $ 10^-15.5: chr "Mean absolute difference: 3.4377e-15"
 $ 10^-16  : chr "Mean absolute difference: 3.4377e-15"
 $ 10^-16.5: chr "Mean absolute difference: 3.4377e-15"
 $ 10^-17  : chr "Mean relative difference: 279.4878"
 $ 10^-17.5: chr "Mean relative difference: 279.4878"
 $ 10^-18  : chr "Mean relative difference: 279.4878"
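
The pattern in the two listings follows from the rule discussed above. As a
sketch, the hypothetical helper mode_for() below (not part of R) restates it
for scalar inputs with no `scale` given, and tabulates which kind of
difference all.equal() would use for each of the two targets:

```r
tols <- 10^-seq(12, 18, by = 0.5)

# Hypothetical helper restating the rule for scalars and scale = NULL:
# relative if |target| > tolerance, absolute otherwise
mode_for <- function(target, tol) {
  ifelse(abs(target) > tol, "relative", "absolute")
}

modes <- data.frame(
  tolerance    = tols,
  target_big   = mode_for(3.45e-15, tols),   # first listing
  target_small = mode_for(1.23e-17, tols),   # second listing (swapped)
  stringsAsFactors = FALSE
)
print(modes)
```

In "absolute" mode the difference 3.4377e-15 is compared with the tolerance
itself, which is why both listings return TRUE down to 10^-14; from 10^-14.5
on, the larger target is already in "relative" mode, while the smaller one
only gets there at 10^-17.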
