[R] bug?

Tue Jul 15 03:39:08 CEST 2003

Marc Vandemeulebroecke <vandemem at gmx.de> asked:
	is there a sensible explanation for the following behaviour?

	> seq(0.6, 0.9, by=0.1) == 0.8
	[1] FALSE FALSE  TRUE FALSE
	> seq(0.7, 0.9, by=0.1) == 0.8
	[1] FALSE FALSE FALSE

Yes.  It's called "floating-point arithmetic".  Only computers using
decimal floating-point arithmetic can represent 0.1 exactly; computers
using binary floating-point can only represent numbers of the form
(whole number) * (power of 2), within a limited precision {plus other
stuff you probably don't want to know about, like NaNs, which aren't
relevant here}.  So 0.1, 0.7 and 0.8 are each stored as the nearest
representable number, not as the decimal fraction you typed.
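
A quick way to see this for yourself is to print more digits than R
shows by default.  This is only a sketch; the variable name `stored` is
mine, and the trailing digits assume the usual IEEE 754 double precision.

```r
# Print 20 decimal places to expose what is actually stored.
stored <- sprintf("%.20f", c(0.1, 0.5, 0.8))
names(stored) <- c("0.1", "0.5", "0.8")
print(stored)

# 0.5 is 1 * 2^-1, so it is stored exactly; 0.1 and 0.8 are only
# the nearest binary fractions and show nonzero trailing digits.
```

Only 0.5 should come back as all zeros after the first digit.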

Let's see what you got:
    > seq(0.6, 0.9, by=0.1) - 0.8
    [1] -0.2 -0.1  0.0  0.1
    > seq(0.7, 0.9, by=0.1) - 0.8 
    [1] -1.000000e-01 -1.110223e-16  1.000000e-01
                      ^^^^^^^^^^^^^

This difference isn't 0; it's one unit in the last place.  The sum
0.7 + 0.1 rounds to the representable number just below the one that
the literal 0.8 rounds to.
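
If you do need to test such values for equality, one common approach
(not the workaround recommended in this message, just an alternative)
is to compare with a tolerance.  A minimal sketch, assuming IEEE 754
doubles; the names `d`, `is_one_ulp` and `near_equal` are mine:

```r
x <- seq(0.7, 0.9, by = 0.1)
d <- x[2] - 0.8                  # the "almost zero" difference shown above

# For doubles in [0.5, 1) one unit in the last place is 2^-53,
# which is half of .Machine$double.eps.
is_one_ulp <- abs(d) == .Machine$double.eps / 2

# all.equal() compares with a tolerance instead of bit for bit.
near_equal <- isTRUE(all.equal(x[2], 0.8))

# Vectorised form: which elements are within 1e-9 of 0.8?
hits <- which(abs(x - 0.8) < 1e-9)
print(c(is_one_ulp = is_one_ulp, near_equal = near_equal))
print(hits)
```

The tolerance 1e-9 is an arbitrary choice for illustration; pick one
that suits the scale of your data.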

The best way to work around this is to use by=x only when x is a
whole number times a power of two, and to scale the result afterwards.
For example,

    > seq(6, 9, by=1)*0.1 == 0.8
    [1] FALSE FALSE  TRUE FALSE
    > seq(7, 9, by=1)*0.1 == 0.8
    [1] FALSE  TRUE FALSE
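
One caveat worth checking for yourself: multiplying by 0.1 still uses
the inexact stored value of 0.1, so it matches the literal for 0.8 but
not for every multiple.  Dividing the whole numbers by 10 rounds the
exact quotient only once.  A sketch, assuming IEEE 754 doubles; the
names `times_tenth`, `over_ten` and `literals` are mine:

```r
# Whole-number sequence scaled by the (inexact) double 0.1 ...
times_tenth <- seq(1, 9, by = 1) * 0.1
# ... versus the same sequence divided by the (exact) double 10.
over_ten <- seq(1, 9, by = 1) / 10

literals <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
print(times_tenth == literals)   # some elements differ, e.g. 3 * 0.1 vs 0.3
print(over_ten == literals)
```

Either way the sequence itself is generated from whole numbers, which is
the point of the workaround.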

This is the reason why DO-loops with REAL control variables were
declared obsolescent in Fortran 90 (and deleted in Fortran 95); they
often give you very nasty surprises.
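
The same surprise is easy to reproduce in R.  This is an illustrative
sketch, not Fortran; the counters `n_real` and `n_int` are mine.  A loop
that steps a real variable by 0.1 until it reaches 1 runs one more time
than you would expect, because ten additions of 0.1 fall just short of 1.

```r
# Loop controlled by a floating-point variable: add 0.1 until reaching 1.
x <- 0
n_real <- 0
while (x < 1) {
  x <- x + 0.1
  n_real <- n_real + 1
}

# Same loop controlled by a whole-number counter; the real value is
# derived from the counter instead of being accumulated.
n_int <- 0
for (i in 0:9) {
  x_i <- i / 10
  n_int <- n_int + 1
}

print(c(n_real = n_real, n_int = n_int))
```

With IEEE 754 doubles the first loop runs 11 times and the second 10.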