[Rd] Computing means, variances and sums
Prof Brian Ripley
ripley at stats.ox.ac.uk
Wed Feb 22 09:52:13 CET 2006
I've managed to track this down. The setting of the FPU control word on a
ix86 machine changes the precision of (gcc) long double calculations in
the FPU, as well as those of double. So if it gets changed to PC_53, long
doubles lose accuracy even though 10 bytes are carried around.
For some discussion of extended-precision issues, see Priest's annex to
Goldberg's paper at http://www.validlab.com/goldberg/paper.pdf.
Many other OSes other than Windows on ix86 do consider the FPU control
word when doing context-switching. There is (often heated) debate as to
what the correct default should be, see e.g. the thread begining
http://gcc.gnu.org/ml/gcc/1998-12/msg00473.html
Whereas Linux has selected extended precision, apparently FreeBSD selected
53-bit mantissa, and Solaris x86 used the equivalent of -ffloat-store to
ensure stricter compliant to the IEEE754 'double' model (and cynics say,
to make the Sparc Solaris performance look good).
It will be interesting to see what MacIntel has selected (I suspect the
FreeBSD solution).
This does mean that using long doubles for accumulators is not necessarily
always a win although the potential loss is likely to be small.
On Sun, 19 Feb 2006, Duncan Murdoch wrote:
> On 2/19/2006 3:18 PM, Prof Brian Ripley wrote:
>> On Sun, 19 Feb 2006, hadley wickham wrote:
>>
>>>> p.s. If my computations are correct, 0.2 = 0*/2 + 0/4 + 1/8 + 1/16 +
>>>> 0/32 + 0/64 + 1/128 + 1/256 + 0/512 + 0/1024 + 1/2048 + 1/4096 + ... =
>>>> 0.3333333333333h. Perhaps someone can extend this to an FAQ to help
>>>> explain finite precision arithmetic and rounding issues.
>>> This is drifting a bit off topic, but the other day I discovered this
>>> rather nice illustration of the perils of finite precision arithmetic
>>> while creating a contrast matrix:
>>>
>>>> n <- 13
>>>> a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
>>>> rowSums(a)
>>> [1] 2.775558e-16 2.775558e-16 5.551115e-17 5.551115e-17 5.551115e-17
>>> [6] 5.551115e-17 0.000000e+00 -5.551115e-17 0.000000e+00 5.551115e-17
>>> [11] 1.110223e-16 1.665335e-16 2.220446e-16
>>>
>>> Not only do most of the rows not sum to 0, they do not even sum to the
>>> same number! It is hard to remember the familiar rules of arithmetic
>>> do not always apply.
>>
>> I think you will find this example does give all 0's in R-devel, even on
>> platforms like Sparc.
>
> Only until the fpu precision gets changed:
>
>> n <- 13
>> a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
>> rowSums(a)
> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
>> RSiteSearch('junk')
> A search query has been submitted to http://search.r-project.org
> The results page should open in your browser shortly
>
>> n <- 13
>> a <- matrix(-1/n, ncol=n, nrow=n) + diag(n)
>> rowSums(a)
> [1] 2.775558e-16 2.775558e-16 5.551115e-17 5.551115e-17 5.551115e-17
> [6] 5.551115e-17 0.000000e+00 -5.551115e-17 0.000000e+00 5.551115e-17
> [11] 1.110223e-16 1.665335e-16 2.220446e-16
>
> We still need to protect against these changes. I'll put something together,
> unless you're already working on it.
>
> The approach I'm thinking of is to define a macro to be called in risky
> situations. On platforms where this isn't an issue, the macro would be null;
> on Windows, it would reset the fpu to full precision.
>
> For example, RSiteSearch causes damage in the ShellExecute call in
> do_shellexec called from browseURL, so I'd add protection there. I think we
> should also add detection code somewhere in the evaluation loop to help in
> diagnosing these problems.
>
>> But users do need to remember that computer arithmetic is inexact except in
>> rather narrowly delimited cases.
>
> Yes, that too.
>
> Duncan Murdoch
>
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list