[Rd] Accuracy: Correct sums in rowSums(), colSums() (PR#6196)

Tue Dec 30 09:30:21 MET 2003

Could you please prepare a patch against the current R-devel sources?
1.7.1 is several versions old.

I was puzzled as to why you are recommending improving the accuracy of the 
`fast' methods and not the slow one, applying apply() to mean() and sum().
The improvement might be larger in those cases: what Goldberg calls 
`a dramatic improvement' is from N eps down to 2 eps, and that is for the 
worst case and is only dramatic if N is dramatically bigger than 2.  It is 
also not clear that it is effective on machines with extended-precision 
registers unless N is very large.

Some of us are well aware of the issues of floating point arithmetic, but 
R is normally used for statistical computations, where errors in the data 
(rounding, measurement etc) normally are orders of magnitude bigger than 
those in floating-point arithmetic.  So implementation effort has been 
spent on handling the statistical errors first.

On Tue, 30 Dec 2003 Nick.Efthymiou at schwab.com wrote:

> Full_Name: Nick Efthymiou
> Version: R1.5.0 and above
> OS: Red Hat Linux 
> Submission from: (NULL) (162.93.14.73)
> 
> 
> With the introduction of the functions rowSums(), colSums(), rowMeans() and
> colMeans() in R1.5.0, function "SEXP do_colsum(SEXP call, SEXP op, SEXP args,
> SEXP rho)" was added to perform the fast summations. We have an excellent
> opportunity to improve the accuracy by implementing Kahan summation here.
> 
> Kahan summation is described in 
> 
> @article{ goldberg91what,
>     author = "David Goldberg",
>     title = "What Every Computer Scientist Should Know About Floating-Point
> Arithmetic",
>     journal = "ACM Computing Surveys",
>     volume = "23",
>     number = "1",
>     pages = "5--48",
>     year = "1991",
>     url = "citeseer.nj.nec.com/goldberg91what.html" }
> 
> 
> (http://citeseer.nj.nec.com/goldberg91what.html)
> 
> and in the article "Floating-point Summation", by Evan Manning, in C/C++ User's
> Journal, Sept. 1996.
> 
> The proposed improvement has been tested in !IEEE_754 mode, the patch is
> attached.
> It is intended to be applied against R-1.7.1/src/main/array.c
> 
> --------- Cut here ----------
> *** array.c.old	Mon Dec 15 17:33:23 2003
> --- array.c	Mon Dec 15 17:33:45 2003
> ***************
> *** 1016,1022 ****
>       int OP, n, p, cnt = 0, i, j, type;
>       Rboolean NaRm, keepNA;
>       int *ix;
> !     double *rx, sum = 0.0;
>   
>       checkArity(op, args);
>       x = CAR(args); args = CDR(args);
> --- 1016,1022 ----
>       int OP, n, p, cnt = 0, i, j, type;
>       Rboolean NaRm, keepNA;
>       int *ix;
> !     double *rx, sum = 0.0, correction = 0.0;
>   
>       checkArity(op, args);
>       x = CAR(args); args = CDR(args);
> ***************
> *** 1046,1062 ****
>   	    switch (type) {
>   	    case REALSXP:
>   		rx = REAL(x) + n*j;
>   #ifdef IEEE_754
>   		if (keepNA)
> ! 		    for (sum = 0., i = 0; i < n; i++) sum += *rx++;
>   		else {
> ! 		    for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++)
> ! 			if (!ISNAN(*rx)) {cnt++; sum += *rx;}
>   			else if (keepNA) {sum = NA_REAL; break;}
>   		}
>   #else
> ! 		for (cnt = 0, sum = 0., i = 0; i < n; i++, rx++)
> ! 		    if (!ISNAN(*rx)) {cnt++; sum += *rx;}
>   		    else if (keepNA) {sum = NA_REAL; break;}
>   #endif
>   		break;
> --- 1046,1082 ----
>   	    switch (type) {
>   	    case REALSXP:
>   		rx = REAL(x) + n*j;
> + 		sum = 0.;
>   #ifdef IEEE_754
>   		if (keepNA)
> ! 		    for (i = 0; i < n; i++) sum += *rx++;
>   		else {
> ! 		    for (cnt = 0, i = 0; i < n; i++, rx++)
> ! 			if (!ISNAN(*rx)) {
> ! 				/* TODO: Kahan summation */
> ! 				cnt++; sum += *rx;
> ! 			}
>   			else if (keepNA) {sum = NA_REAL; break;}
>   		}
>   #else
> ! 		i = cnt = 0;
> ! 		if (i < n) {
> ! 		    if (ISNAN(*rx)) {
> ! 			if (keepNA) {sum = NA_REAL; break /* switch */;}
> ! 		    } else {
> ! 			cnt++; sum = *rx;
> ! 		    }
> ! 		    i++; rx++;
> ! 		}
> ! 		for (; i < n; i++, rx++)
> ! 		    if (!ISNAN(*rx)) {
> ! 			/* Kahan summation */
> ! 			double yhilo = *rx - correction;
> ! 			double tsum = sum + yhilo;
> ! 			correction = (tsum - sum) - yhilo;
> ! 			sum = tsum;
> ! 			cnt++;
> ! 		    }
>   		    else if (keepNA) {sum = NA_REAL; break;}
>   #endif
>   		break;
> ***************
> *** 1121,1137 ****
>   	    switch (type) {
>   	    case REALSXP:
>   		rx = REAL(x) + i;
>   #ifdef IEEE_754
>   		if (keepNA)
> ! 		    for (sum = 0., j = 0; j < p; j++, rx += n) sum += *rx;
>   		else {
> ! 		    for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n)
> ! 			if (!ISNAN(*rx)) {cnt++; sum += *rx;}
>   			else if (keepNA) {sum = NA_REAL; break;}
>   		}
>   #else
> ! 		for (cnt = 0, sum = 0., j = 0; j < p; j++, rx += n)
> ! 		    if (!ISNAN(*rx)) {cnt++; sum += *rx;}
>   		    else if (keepNA) {sum = NA_REAL; break;}
>   #endif
>   		break;
> --- 1141,1177 ----
>   	    switch (type) {
>   	    case REALSXP:
>   		rx = REAL(x) + i;
> + 		sum = 0.;
>   #ifdef IEEE_754
>   		if (keepNA)
> ! 		    for (j = 0; j < p; j++, rx += n) sum += *rx;
>   		else {
> ! 		    for (cnt = 0, j = 0; j < p; j++, rx += n)
> ! 			if (!ISNAN(*rx)) {
> ! 				/* TODO: Kahan summation */
> ! 				cnt++; sum += *rx;
> ! 			}
>   			else if (keepNA) {sum = NA_REAL; break;}
>   		}
>   #else
> ! 		j = cnt = 0;
> ! 		if (j < p) {
> ! 		    if (ISNAN(*rx)) {
> ! 			if (keepNA) {sum = NA_REAL; break /* switch */;}
> ! 		    } else {
> ! 			cnt++; sum = *rx;
> ! 		    }
> ! 		    j++; rx += n;
> ! 		}
> ! 		for (; j < p; j++, rx += n)
> ! 		    if (!ISNAN(*rx)) {
> ! 			/* Kahan summation */
> ! 			double yhilo = *rx - correction;
> ! 			double tsum = sum + yhilo;
> ! 			correction = (tsum - sum) - yhilo;
> ! 			sum = tsum;
> ! 			cnt++;
> ! 		    }
>   		    else if (keepNA) {sum = NA_REAL; break;}
>   #endif
>   		break;
> ---------end of patch -------
> 
> 
> Since someone is certainly wondering why I bothered to bring this up - I was
> trying to optimize the order of a Chebyshev approximation to a specific function
> so that I accomplish the most power-efficient calculation at one ulp accuracy
> (multiplications and additions consume battery power). The optimization was
> based on minimizing the relative error between the true function and the
> approximation. It involved a lot of sums and was not converging properly. With
> the patch, I got the convergence I needed.
> 
> Thanks,
> 
> - Nick -
> 
> ______________________________________________
> R-devel at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
> 
> 

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595