[Rd] bug in sum() on integer vector

Wed Dec 14 02:43:47 CET 2011

Hi Ted,

On 11-12-13 04:52 PM, (Ted Harding) wrote:
[...]
> Now, computer programs for numerical computation can broadly
> be divided into two types.
>
> In one, "arbitrary precision" is available: you can tell
> the program how many decimal digits you want it to work to.
> An example of this is 'bc':
>
>    http://en.wikipedia.org/wiki/Bc_programming_language
>
> You can set as many decimal digits as you like, *provided*
> they fall within the storage capacity of your computer, for
> which an upper bound is the storage capacity of the Universe
> (see above). For integers and results which surpass the
> decimal places you have set, the result will be an approximation.
> Inevitably.

AFAICT, with bc and other tools doing arithmetic on arbitrarily large
integers, operations like +, -, *, ^ etc. either give the exact answer
or they fail. That's the beauty of those tools. Otherwise you could
call them "pointless" or "broken".
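
To make the "exact answer" point concrete, here is a minimal sketch in
Python, whose built-in int is arbitrary precision and is used here only
as a stand-in for bc. The variable names are made up for illustration.
It compares an exact integer product with the same product computed in
double precision:

```python
# Python's built-in int is arbitrary precision; it stands in for bc here.
a = 2**64 + 1
b = 2**64 - 1

# Integer arithmetic: the product is exact, however many digits it needs.
exact_product = a * b

# Double-precision arithmetic: both operands are rounded to 2**64 before
# multiplying, so the result is only an approximation of a * b.
rounded_product = int(float(a) * float(b))

print(exact_product == 2**128 - 1)        # exact result
print(rounded_product == exact_product)   # the double result differs
```

The integer product is never silently rounded. The sketch does not show
the "fail" half, which in practice only happens when memory runs out.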

Arbitrary-precision vs fixed-precision is slightly off topic though.
In particular I didn't suggest that doing sum() on an integer vector
should use arbitrarily large integers internally to do the computation.
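
For what it's worth, "return the correct value or fail" does not require
arbitrary precision in the result. Below is a rough sketch in Python,
not R's actual implementation: checked_sum_int32 is a hypothetical name,
and Python's unbounded int merely plays the role of a wider accumulator.
The result is either the exact sum or an explicit error, never a
silently wrong number:

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def checked_sum_int32(values):
    """Return the exact sum of 32-bit integers, or raise OverflowError.

    Hypothetical helper for illustration only. The running total is kept
    in a wider accumulator (here Python's unbounded int) and only the
    final result is checked against the 32-bit range.
    """
    total = 0
    for v in values:
        if not INT32_MIN <= v <= INT32_MAX:
            raise OverflowError("input value does not fit in 32 bits")
        total += v
    if not INT32_MIN <= total <= INT32_MAX:
        raise OverflowError("sum does not fit in 32 bits")
    return total


print(checked_sum_int32([1, 2, 3]))
try:
    checked_sum_int32([INT32_MAX, 1])
except OverflowError as err:
    print("failed:", err)
```

Whether the failure is reported as an error, or as a missing value plus
a warning, is a separate design choice.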

Cheers,
H.

>
> In the other type, the program is written so as to embody
> integers to a fixed maximum number of decimal (or binary)
> digits. An example of this is R (and most other numerical
> programs). This may be 32 bits or 64 bits. Any result of
> computation which involves more than this number of bits
> is inevitably an approximation.
>
> Provided the user is aware of this, there is no need for
> your "It should always return the correct value or fail."
> It will return the correct value if the integers are not
> too large; otherwise it will return the best approximation
> that it can cope with in the fixed finite storage space
> for which it has been programmed.
>
> There is an implicit element of the arbitrary in this. You
> can install 32-bit R on a 64-bit-capable machine, or a
> 64-bit version. You could re-program R so that it can
> work to, say, 128 bits or 256 bits even on a 32-bit machine
> (using techniques like those that underlie 'bc'), but
> that would be an arbitrary choice. However, the essential
> point is that some choice is unavoidable, since if you push
> it too far the Universe will run out of particles -- and the
> computer industry will run out of transistors long before
> you hit the Universe limit!
>
> So you just have to accept the limits. Provided you are aware
> of the approximations which may set in at some point, you can
> cope with the consequences, so long as you take account of
> some concept of "adequacy" in the inevitable approximations.
> Simply to "fail" is far too unsophisticated a result!
>
> Hoping this is useful,
> Ted.
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding)<ted.harding at wlandres.net>
> Fax-to-email: +44 (0)870 094 0861
> Date: 14-Dec-11                                       Time: 00:52:49
> ------------------------------ XFMail ------------------------------

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319