[Rd] Discrepancy: R sum() VS C or Fortran sum

Fri Mar 16 16:29:16 CET 2018

Pierre,

It's comforting to think that the same simple summation loop implemented in
R, C, and Fortran would give bit-wise identical results, but that isn't the
case in practice.  Floating-point results depend on lots of things: the
chip used, the compiler used, the optimization flags, etc.  For example, in
the unlikely event that you run this on a 32-bit machine, you have the
messy 80-bit internal precision used by the double precision hardware on
the 32-bit platform to consider.  This is enough to derail the "bit-wise
equivalence train" right away.  There are plenty of other such issues that
can rise up and play a role.

Of course, there may be something about R's double precision evaluation
that is wrong, but IMHO a failure to be bit-wise identical with a
computation done elsewhere isn't enough to say a bug has been turfed up.
IMHO it's not unusual for computations to differ by the least significant
bit or two.  If that is enough to derail your computations for the
Jacobians and Hessians, that is at least a teachable moment for your
numerical methods class you have.

BTW, are you using finite difference techniques for the derivatives or
computational derivatives?  The latter techniques are not so sensitive to
round-off errors: I would expect them to be as accurate as symbolic
derivatives.

HTH,

-Steve

On Fri, Mar 16, 2018 at 10:58 AM, Pierre Chausse <pchausse at uwaterloo.ca>
wrote:

> Hi all,
>
> I found a discrepancy between the sum() in R and either a sum done in C or
> Fortran for vector of just 5 elements. The difference is very small, but
> this is a very small part of a much larger numerical problem in which first
> and second derivatives are computed numerically. This is part of a
> numerical method course I am teaching in which I want to compare speeds of
> R versus Fortran (We solve a general equilibrium problem all numerically,
> if you want to know). Because of this discrepancy, the Jacobian and Hessian
> in R versus in Fortran are quite different, which results in the Newton
> method producing a different solution (for a given stopping rule). Since
> the solution computed in Fortran is almost identical to the analytical
> solution, I suspect that the sum in Fortran may be more accurate (That's
> just a guess).  Most of the time the sum produces identical results, but
> for some numbers, it is different. The following example, shows what
> happens:
>
> set.seed(12233)
> n <- 5
> a <- runif(n,1,5)
> e <- runif(n, 5*(1:n),10*(1:n))
> s <- runif(1, 1.2, 4)
> p <- runif(5, 3, 10)
> x <- c(e[-5], (sum(e*p)-sum(e[-5]*p[-5]))/p[5])
> u <- a^(1/s)*x^((s-1)/s)
> dyn.load("sumF.so")
>
> u[1] <- u[1]+.0001 ### If we do not add .0001, all differences are 0
> s1 <- sum(u)
> s2 <- .Fortran("sumf", as.double(u), as.integer(n), sf1=double(1),
>                sf2=double(1))[3:4]
> s3 <- .C("sumc", as.double(u), as.integer(n), sC=double(1))[[3]]
>
> s1-s2[[1]] ## R versus compiler sum() (Fortran)
>
> [1] -7.105427e-15
>
> s1-s2[[2]] ## R versus manual sum (Fortran
>
> [1] -7.105427e-15
>
> s1-s3 ## R Versus manual sum in C
>
> [1] -7.105427e-15
>
> s2[[2]]-s2[[1]] ## manual sum versus compiler sum() (Fortran)
>
> [1] 0
>
> s3-s2[[2]] ## Fortran versus C
>
> [1] 0
>
> My sumf and sumc are
>
>       subroutine sumf(x, n, sx1, sx2)
>       integer i, n
>       double precision x(n), sx1, sx2
>       sx1 = sum(x)
>       sx2 = 0.0d0
>       do i=1,n
>          sx2 = sx2+x(i)
>       end do
>       end
>
> void sumc(double *x, int *n, double *sum)
> {
>   int i;
>   double sum1 = 0.0;
>   for (i=0; i< *n; i++) {
>     sum1 += x[i];
>   }
>   *sum = sum1;
> }
>
> Can that be a bug?  Thanks.
>
> --
> Pierre Chaussé
> Assistant Professor
> Department of Economics
> University of Waterloo
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Steven Dirkse, Ph.D.
GAMS Development Corp.
office: 202.342.0180

	[[alternative HTML version deleted]]