[Rd] Calculation of e^{z^2/2} for a normal deviate z

Mon Jun 24 11:47:37 CEST 2019

Hi Martin,

Thanks for another tips -- it will be hard for me to comprehend fully your implementation of Rmpfr. I will look at these instead. I realised what I thought was a difficult problem turned to be a popular one. I managed to get a Python version here for information.

# http://bayesjumping.net/log-sum-exp-trick/
import numpy as np
def logSumExp(ns):
    max = np.max(ns)
    ds = ns - max
    sumOfExp = np.exp(ds).sum()
    return max + np.log(sumOfExp)

x = [100001000, 100002000]
logSumExp(x)

from scipy.misc import logsumexp
logsumexp(x)

import numpy as np;
prob1 = np.log(1e-50)
prob2 = np.log(2.5e-50)
prob12 = np.logaddexp(prob1, prob2)
prob12
np.exp(prob12)

As expected they are in good agreement with R.

Best regards,

Jing Hua

________________________________
From: Martin Maechler <maechler using stat.math.ethz.ch>
Sent: 24 June 2019 09:29
To: jing hua zhao
Cc: William Dunlap; Martin Maechler; r-devel using r-project.org
Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z

>>>>> jing hua zhao
>>>>>     on Mon, 24 Jun 2019 08:51:43 +0000 writes:

    > Hi All,
    > Thanks for all your comments which allows me to appreciate more of these in Python and R.

    > I just came across the matrixStats package,

    > ## EXAMPLE #1
    > lx <- c(1000.01, 1000.02)
    > y0 <- log(sum(exp(lx)))
    > print(y0) ## Inf

    > y1 <- logSumExp(lx)
    > print(y1) ## 1000.708

    > and

    >> ly <- lx*100000
    >> ly
    > [1] 100001000 100002000
    >> y1 <- logSumExp(ly)
    >> print(y1)
    > [1] 100002000
    >> logSumExp
    > function (lx, idxs = NULL, na.rm = FALSE, ...)
    > {
    > has_na <- TRUE
    > .Call(C_logSumExp, as.numeric(lx), idxs, as.logical(na.rm),
    > has_na)
    > }
    > <bytecode: 0x20c07a8>
    > <environment: namespace:matrixStats>

    > Maybe this is rather close?

Thank you  Jing Hua,

indeed the issue of sums of (very large or very small)
exponentials is a special case, that can well be treated
specially
 - as it is not so infrequent
 - via "obvious" simplifications can be implemented even more accurately

We (authors of the R package 'copula') have had a need for
these as well, in likelihood computation for Archimedean
copulas, and
have efficient R level implementations, both for your case  and
the -- even more delicate -- case of e.g., alternating signs of
exponential terms.

"Unfortunately", we had never exported the functions from the
package, so you'd need

     copula:::lsum()   # log          sum {of exponentials in log scale}
or
     copula:::lssum()  # log *s*igned sum {of exponentials in log scale}

for the 2nd case.

The advantage is it's simple R code implemented quite
efficiently for the vector and matrices cases,

Source code from source file copula/R/special-func.R

 (in svn at R-forge :
   -->
   https://r-forge.r-project.org/scm/viewvc.php/pkg/copula/R/special-func.R?view=markup&root=copula )

{Yes, this GPL-2 licenced  {with Copyright, see file, please
                         keep this one line}
## Copyright (C) 2012 Marius Hofert, Ivan Kojadinovic, Martin Maechler, and Jun Y
}

----------------------------------------------------------------------

##' Properly compute log(x_1 + .. + x_n) for a given (n x d)-matrix of n row
##' vectors log(x_1),..,log(x_n) (each of dimension d)
##' Here, x_i > 0  for all i
##' @title Properly compute the logarithm of a sum
##' @param lx (n,d)-matrix containing the row vectors log(x_1),..,log(x_n)
##'        each of dimension d
##' @param l.off the offset to subtract and re-add; ideally in the order of
##'        the maximum of each column
##' @return log(x_1 + .. + x_n) [i.e., OF DIMENSION d!!!] computed via
##'         log(sum(x)) = log(sum(exp(log(x))))
##'         = log(exp(log(x_max))*sum(exp(log(x)-log(x_max))))
##'         = log(x_max) + log(sum(exp(log(x)-log(x_max)))))
##'         = lx.max + log(sum(exp(lx-lx.max)))
##'         => VECTOR OF DIMENSION d
##' @author Marius Hofert, Martin Maechler
lsum <- function(lx, l.off) {
    rx <- length(d <- dim(lx))
    if(mis.off <- missing(l.off)) l.off <- {
        if(rx <= 1L)
            max(lx)
        else if(rx == 2L)
            apply(lx, 2L, max)
    }
    if(rx <= 1L) { ## vector
        if(is.finite(l.off))
            l.off + log(sum(exp(lx - l.off)))
        else if(mis.off || is.na(l.off) || l.off == max(lx))
            l.off # NA || NaN or all lx == -Inf, or max(.) == Inf
        else
            stop("'l.off  is infinite but not == max(.)")
    } else if(rx == 2L) { ## matrix
        if(any(x.off <- !is.finite(l.off))) {
            if(mis.off || isTRUE(all.equal(l.off, apply(lx, 2L, max)))) {
                ## we know l.off = colMax(.)
                if(all(x.off)) return(l.off)
                r <- l.off
                iok <- which(!x.off)
                l.of <- l.off[iok]
                r[iok] <- l.of + log(colSums(exp(lx[,iok,drop=FALSE] -
                                                     rep(l.of, each=d[1]))))
                r
            } else ## explicitly specified l.off differing from colMax(.)
                stop("'l.off' has non-finite values but differs from default max(.)")
        }
        else
            l.off + log(colSums(exp(lx - rep(l.off, each=d[1]))))
    } else stop("not yet implemented for arrays of rank >= 3")
}

##' Properly compute log(x_1 + .. + x_n) for a given matrix of column vectors
##' log(|x_1|),.., log(|x_n|) and corresponding signs sign(x_1),.., sign(x_n)
##' Here, x_i is of arbitrary sign
##' @title compute logarithm of a sum with signed large coefficients
##' @param lxabs (d,n)-matrix containing the column vectors log(|x_1|),..,log(|x_n|)
##'        each of dimension d
##' @param signs corresponding matrix of signs sign(x_1), .., sign(x_n)
##' @param l.off the offset to subtract and re-add; ideally in the order of max(.)
##' @param strict logical indicating if it should stop on some negative sums
##' @return log(x_1 + .. + x_n) [i.e., of dimension d] computed via
##'         log(sum(x)) = log(sum(sign(x)*|x|)) = log(sum(sign(x)*exp(log(|x|))))
##'         = log(exp(log(x0))*sum(signs*exp(log(|x|)-log(x0))))
##'         = log(x0) + log(sum(signs* exp(log(|x|)-log(x0))))
##'         = l.off   + log(sum(signs* exp(lxabs -  l.off  )))
##' @author Marius Hofert and Martin Maechler
lssum <- function (lxabs, signs, l.off = apply(lxabs, 2, max), strict = TRUE) {
    stopifnot(length(dim(lxabs)) == 2L) # is.matrix(.) generalized
    sum. <- colSums(signs * exp(lxabs - rep(l.off, each=nrow(lxabs))))
    if(anyNA(sum.) || any(sum. <= 0))
        (if(strict) stop else warning)("lssum found non-positive sums")
    l.off + log(sum.)
}

----------------------------------------------------------------------

    > Best wishes,
    > Jing Hua

    > ________________________________
    > From: R-devel <r-devel-bounces using r-project.org> on behalf of Martin Maechler <maechler using stat.math.ethz.ch>
    > Sent: 24 June 2019 08:37
    > To: William Dunlap
    > Cc: r-devel using r-project.org
    > Subject: Re: [Rd] Calculation of e^{z^2/2} for a normal deviate z

>>>>> William Dunlap via R-devel
>>>>>     on Sun, 23 Jun 2019 10:34:47 -0700 writes:
>>>>> William Dunlap via R-devel
>>>>>     on Sun, 23 Jun 2019 10:34:47 -0700 writes:

    >> include/Rmath.h declares a set of 'logspace' functions for use at the C
    >> level.  I don't think there are core R functions that call them.

    >> /* Compute the log of a sum or difference from logs of terms, i.e.,
    >> *
    >> *     log (exp (logx) + exp (logy))
    >> * or  log (exp (logx) - exp (logy))
    >> *
    >> * without causing overflows or throwing away too much accuracy:
    >> */
    >> double  Rf_logspace_add(double logx, double logy);
    >> double  Rf_logspace_sub(double logx, double logy);
    >> double  Rf_logspace_sum(const double *logx, int nx);

    >> Bill Dunlap
    >> TIBCO Software
    >> wdunlap tibco.com

[[elided Hotmail spam]]

    > But they *have* been in use by core R functions for a long time
    > in pgamma, pbeta and related functions.

    > [and I have had changes in *hyper.c where logspace_add() is used too,
    > for several years now (since 2015) but I no longer know which
    > concrete accuracy problem that addresses, so have not yet committed it]

    > Martin Maechler
    > ETH Zurich and R Core Team

	[[alternative HTML version deleted]]