[R] Best way to compute a sum

Duncan Murdoch murdoch.duncan at gmail.com
Thu Jun 24 22:26:51 CEST 2010


On 24/06/2010 4:08 PM, Lasse Kliemann wrote:
>   > a <- 0 ; for(i in (1:200000000)) a <- a + 1/i
>   > b <- 0 ; for(i in (200000000:1)) b <- b + 1/i
>   > c <- sum(1/(1:200000000))
>   > d <- sum(1/(200000000:1))
>   > order(c(a,b,c,d))
>   [1] 1 2 4 3
>   > b<c
>   [1] TRUE
>   > c==d
>   [1] FALSE
>
> I'd expected b being the largest, since we sum up the smallest 
> numbers first. Instead, c is the largest, which is sum() applied 
> to the vector ordered with largest numbers first.
>
> Can anyone shed some light on this?
>   

I don't know why you'd expect b to be larger than the others.  When 
you're using sum(), you should expect the most accurate answer, but it 
won't necessarily be the biggest:  some rounding errors will make the 
answer bigger than it should be.  For example, if we only had whole 
number arithmetic, 0.6 + 0.6 + 0.6 might come out to 3, if each value 
was rounded to 1 before adding.

As to whether c or d comes out better:  you'd need to program it 
yourself if you care about getting the last bit right.
> What is the best way in R to compute a sum while avoiding 
> cancellation effects?
>   
Use sum().  If it's not good enough, then do it in C, accumulating in 
extended precision (which is what sum() does), from smallest to largest 
if all terms are positive.  If there are mixed signs it's a lot harder 
to optimize, but I believe the optimal order would be something like the 
one that keeps each partial sum as close as possible to zero.  It would 
rarely be practical to implement that, so summing from smallest absolute 
value to largest would be my recommendation.
> By the way, sum() in the above example is much faster than the 
> loops, so it would be nice if we could utilize it.
>   

Why do you think you can't?

Duncan Murdoch

> ------------------------------------------------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list