[Rd] Non identical numerical results from R code vs C/C++ code?

Thomas Lumley tlumley at u.washington.edu
Fri Sep 10 18:38:25 CEST 2010


On Fri, 10 Sep 2010, Duncan Murdoch wrote:

> On 10/09/2010 7:07 AM, Renaud Gaujoux wrote:
>> Thank you Duncan for your reply.
>> 
>> Currently I am using 'double' for the computations.
>> What type should I use for extended real in my intermediate computations?
>
> I think it depends on compiler details.  On some compilers "long double" will 
> get it, but I don't think there's a standard type that works on all 
> compilers.  (In fact, the 80 bit type on Intel chips isn't necessarily 
> supported on other hardware.)  R defines LDOUBLE in its header files and it 
> is probably best to use that if you want to duplicate R results.

As a little more detail, 'long double' is in the C99 standard and seems to be fairly widely implemented, so code using it is likely to compile.   The Standard, as usual, doesn't define exactly what type it is, and permits it to be a synonym for 'double', so you may not get any extra precision.

On Intel chips it is likely to be the 80-bit type, but the Sparc architecture doesn't have any larger hardware type.  Radford Neal has recently reported much slower results on Solaris with long double, consistent with Wikipedia's statement that long double is sometimes a software-implemented 128-bit type on these systems.


>> The result will still be 'double' anyway right?
>
> Yes, you do need to return type double.
>
> Duncan Murdoch
>
>> 
>> 
>> 
>> On 10/09/2010 13:00, Duncan Murdoch wrote:
>>> On 10/09/2010 6:46 AM, Renaud Gaujoux wrote:
>>>> Hi,
>>>> 
>>>> suppose you have two versions of the same algorithm: one in pure R, the 
>>>> other one in C/C++ called via .Call().
>>>> Assuming there is no bug in the implementations (i.e. they both do the 
>>>> same thing), is there any well known reason why the C/C++ implementation 
>>>> could return numerical results non identical to the one obtained from the 
>>>> pure R code? (e.g. could it be rounding errors? please explain.)
>>>> Has anybody had a similar experience?
>>> R often uses extended reals (80 bit floating point values on Intel chips) 
>>> for intermediate values.  C compilers may or may not do that.
>>>> By not identical, I mean very small differences (< 2.4 e-14), but enough 
>>>> to have identical() returning FALSE. Maybe I should not bother, but I 
>>>> want to be sure where the differences come from, at least by mere 
>>>> curiosity.
>>>> 
>>>> Briefly the R code perform multiple matrix product; the C code is an 
>>>> optimization of those specific products via custom for loops, where 
>>>> entries are not computed in the same order, etc... which improves both 
>>>> memory usage and speed. The result is theoretically the same.
>>> Changing the order of operations will often affect rounding.  For example, 
>>> suppose epsilon is the smallest number such that 1 + epsilon is not equal 
>>> to 1.  Then 1 + (epsilon/2) + (epsilon/2) will evaluate to either 1 or 1 + 
>>> epsilon, depending on the order of computing the additions.
>>> 
>>> Duncan Murdoch
>>> 
>>>> Thank you,
>>>> Renaud
>>>> 
>> 
>>  
>> ###
>> UNIVERSITY OF CAPE TOWN 
>> This e-mail is subject to the UCT ICT policies and e-mail disclaimer 
>> published on our website at 
>> http://www.uct.ac.za/about/policies/emaildisclaimer/ or obtainable from +27 
>> 21 650 4500. This e-mail is intended only for the person(s) to whom it is 
>> addressed. If the e-mail has reached you in error, please notify the 
>> author. If you are not the intended recipient of the e-mail you may not 
>> use, disclose, copy, redirect or print the content. If this e-mail is not 
>> related to the business of UCT it is sent by the sender in the sender's 
>> individual capacity.
>> 
>> ###
>>  
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley
Professor of Biostatistics
University of Washington, Seattle



More information about the R-devel mailing list