[Rd] RFC: lchoose() vs lfactorial() etc

Tue Dec 15 10:52:43 CET 2009

On Tue, Dec 15, 2009 at 09:49:28AM +0100, Martin Maechler wrote:
> lgamma(x) and lfactorial(x) are defined to return
> 
>  ln|Gamma(x)| {= log(abs(gamma(x)))} or  ln|Gamma(x+1)| respectively.
> 
> Unfortunately,  we haven't chosen the analogous definition for 
> lchoose().
> 
> So, currently 
> 
>   > lchoose(1/2, 1:10)
>    [1] -0.6931472 -2.0794415        NaN -3.2425924        NaN -3.8869494
>    [7]        NaN -4.3357508        NaN -4.6805913
>   Warning message:
>   In lchoose(n, k) : NaNs produced
>   > 
> 
> which (the NaN's) is not particularly useful.
> (I have use case where I really have to workaround those NaNs.)
> 
> I herebey propose to *amend* the definition of lchoose() such
> that it behaves analogously to  lgamma() and lfactorial(),
> i.e., to return
> 
>    log(abs(choose(.,.))
> 
> Your comments are welcome.
> Martin Maechler, ETH Zurich

I do not have a strong opinion, whether lchoose() should be log(choose(.,.))
or log(abs(choose(.,.))), although i would slightly prefere the latter (with abs()).
One of the reasons is that this would simplify the code of the function
lchoose() in src/nmath/choose.c. It produces the NA by explicit testing
the sign, so this test could be simply removed.

Let me also point out that the current implementation seems to contain
a bug, which causes that it is neither log(choose(.,.)) nor log(abs(choose(.,.))).

  x <- choose(1/2, 1:10)
  y <- lchoose(1/2, 1:10)
  cbind(choose=x, lchoose=y, log.choose=log(abs(x)))
            choose    lchoose log.choose
  #  [1,]  0.500000000 -0.6931472 -0.6931472
  #  [2,] -0.125000000 -2.0794415 -2.0794415
  #  [3,]  0.062500000        NaN -2.7725887
  #  [4,] -0.039062500 -3.2425924 -3.2425924
  #  [5,]  0.027343750        NaN -3.5992673
  #  [6,] -0.020507812 -3.8869494 -3.8869494
  #  [7,]  0.016113281        NaN -4.1281114
  #  [8,] -0.013092041 -4.3357508 -4.3357508
  #  [9,]  0.010910034        NaN -4.5180723
  # [10,] -0.009273529 -4.6805913 -4.6805913

So, besides x[1], we get NA for those cases, when choose() is positive.
The reason seems to be the test at line 89 of src/nmath/choose.c, which is

    if (fmod(floor(n-k+1), 2.) == 0) /* choose() < 0 */
        return ML_NAN;

but it should be negated. I tested it using

    if (fmod(floor(n-k+1), 2.) != 0) /* choose() < 0 */
        return ML_NAN;

when we get

            choose    lchoose log.choose
   [1,]  0.500000000 -0.6931472 -0.6931472
   [2,] -0.125000000        NaN -2.0794415
   [3,]  0.062500000 -2.7725887 -2.7725887
   [4,] -0.039062500        NaN -3.2425924
   [5,]  0.027343750 -3.5992673 -3.5992673
   [6,] -0.020507812        NaN -3.8869494
   [7,]  0.016113281 -4.1281114 -4.1281114
   [8,] -0.013092041        NaN -4.3357508
   [9,]  0.010910034 -4.5180723 -4.5180723
  [10,] -0.009273529        NaN -4.6805913

Petr Savicky.