[Rd] Floating point maths in R

Tom McCallum tom.mccallum at levelelimited.com
Sat Dec 9 14:50:09 CET 2006


Many thanks for pointing that out.

Tom


On Sat, 09 Dec 2006 13:48:06 -0000, Peter Dalgaard  
<p.dalgaard at biostat.ku.dk> wrote:

> Tom McCallum wrote:
>> Hi,
>>
>> I am not sure if this is just me using R (R-2.3.1 and R-2.4.0) in the   
>> wrong way or if there is a more serious bug.  I was having problems   
>> getting some calculations to add up so I ran the following tests:
>>
>>
> Please read  FAQ 7.31 and the reference therein.
>
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>
> (Short answer: you cannot represent thirds exactly in decimal, nor
> tenths in binary.)
>>> (2.34567 - 2.00000) == 0.34567 <------- should be true
>>>
>> [1] FALSE
>>
>>> (2.23-2.00) == 0.23 <------- should be true
>>>
>> [1] FALSE
>>
>>> 4-2==2
>>>
>> [1] TRUE
>>
>>> (4-2)==2
>>>
>> [1] TRUE
>>
>>> (4.0-2)==2
>>>
>> [1] TRUE
>>
>>> (4.0-2.0)==2
>>>
>> [1] TRUE
>>
>>> (4.0-2.0)==2.0
>>>
>> [1] TRUE
>>
>>> (4.2-2.2)==2.0
>>>
>> [1] TRUE
>>
>>> (4.20-2.20)==2.00
>>>
>> [1] TRUE
>>
>>> (4.23-2.23)==2.00  <------- should be true
>>>
>> [1] FALSE
>>
>>> (4.230-2.230)==2.000 <------- should be true
>>>
>> [1] FALSE
>>
>>> (4.230-2.230)==2.00 <------- should be true
>>>
>> [1] FALSE
>>
>>> (4.230-2.23)==2.00 <------- should be true
>>>
>> [1] FALSE
>>
>> I have tried these on both 64-bit and 32-bit machines.  Surely R should be
>> able to do maths to 2 decimal places and test these simple
>> expressions?  The problem seems to be that the FPU is placing junk in
>> the 16th decimal place.  I have also tried:
>>
>>
>>> (4.2300000000000000-2.230000000000000) == 2
>>>
>> [1] FALSE
>>
>>> a <- (4.2300000000000000-2.230000000000000)
>>> a == 2
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000000
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000004 <--
>>> correct when the 16th decimal place is set to 4
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.00000000000000043  <--
>>> any digits after the 16th decimal place still leave the expression
>>> true
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.000000000000000435
>>>
>> [1] TRUE
>>
>> Also :
>>
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000001
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000003
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000004
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000005
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000006 <-- I can
>>> understand 3 and 5 being true if rounding is occurring, but 6?
>>>
>> [1] TRUE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000007
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000008
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000009
>>>
>> [1] FALSE
>>
>>> (4.2300000000000000-2.230000000000000) == 2.0000000000000010
>>>
>>
>>
>> This is an example of junk being added in the FPU
>>
>>> formatC(a, digits=20)
>>>
>> [1] "2.0000000000000004441"
>>
>> I don't know if this is just a formatC error when using more than 16   
>> decimal places or if this junk is what is stopping the equality from  
>> being  true:
>>
>>
>>> formatC(a, digits=16)
>>>
>> [1] "                2"
>>
>>> formatC(a, digits=17)  <-- 16 decimal places, 17 significant figures   
>>> shown
>>>
>> [1] "2.0000000000000004" <-- the problem is the 4 at the end
>>
>> Obviously the bytes are divided between the exponent and mantissa in   
>> 16-16bit share it seems, but this doesn't account for the 16th decimal   
>> place behaviour does it?
>>
>> If anyone has a workaround or a reason why this should occur, it would
>> be useful to know.
>>
>> What I would like is to be able to do sums such as (2.3456 - 2) ==
>> 0.3456 and get a sensible answer - any suggestions?  Currently the
>> only way is to formatC the expression to a known number of decimal
>> places - is there a better way?
>>
>> Many thanks
>>
>> Tom
>>
>>
>>
>
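
For the "is there a better way?" question in the quoted message, here is a
minimal sketch of the tolerance-based comparison that FAQ 7.31 points to.
It assumes IEEE 754 doubles (1 sign bit, 11 exponent bits, 52 fraction
bits), which is what R uses on common platforms. The helper `near` and its
default tolerance of 1e-9 are hypothetical, chosen only for illustration;
`all.equal`, `isTRUE`, `sprintf` and `.Machine$double.eps` are base R.

```r
# Value from the quoted session.
a <- 4.23 - 2.23

# Exact comparison: FALSE in the quoted session, because 4.23 and 2.23
# are not exactly representable as binary doubles.
a == 2

# Comparison with a numeric tolerance (roughly 1.5e-8 by default).
# all.equal() returns TRUE or a description of the difference, so wrap
# it in isTRUE() when a single logical is needed.
isTRUE(all.equal(a, 2))        # TRUE

# Hypothetical helper (not part of R): explicit absolute tolerance.
near <- function(x, y, tol = 1e-9) abs(x - y) < tol
near(2.34567 - 2, 0.34567)     # TRUE

# The "junk" is one unit in the last place. Doubles in [2, 4) are spaced
# 2 * .Machine$double.eps apart (about 4.44e-16), so every decimal
# literal from 2.0000000000000003 to 2.0000000000000006 rounds to the
# same double as a, while ...7 rounds to the next one up.
spacing <- 2 * .Machine$double.eps
a - 2 == spacing

# 17 significant digits are enough to tell two distinct doubles apart.
sprintf("%.17g", a)            # "2.0000000000000004"
```

A relative tolerance, as used by all.equal(), is usually the safer default;
an absolute tolerance such as the one in `near` only makes sense when the
scale of the numbers is known in advance, as with values quoted to a fixed
number of decimal places.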



-- 
Dr. Thomas McCallum
Systems Architect,
Level E Limited
ETTC, The King's Buildings
Mayfield Road,
Edinburgh EH9 3JL, UK
Work  +44 (0) 131 472 4813
Fax:  +44 (0) 131 472 4719
http://www.levelelimited.com
Email: tom at levelelimited.com
