[Rd] qnbinom with small size is slow

Constantin Ahlmann-Eltze @rtjom31415 @end|ng |rom goog|em@||@com
Mon Aug 10 10:05:36 CEST 2020


Thanks Ben for verifying the issue. It is always reassuring to hear
when others can reproduce the problem.

I wrote a small patch that fixes the issue
(https://github.com/r-devel/r-svn/pull/11):

diff --git a/src/nmath/qnbinom.c b/src/nmath/qnbinom.c
index b313ce56b2..d2e8d98759 100644
--- a/src/nmath/qnbinom.c
+++ b/src/nmath/qnbinom.c
@@ -104,6 +104,7 @@ double qnbinom(double p, double size, double prob,
int lower_tail, int log_p)
     /* y := approx.value (Cornish-Fisher expansion) :  */
     z = qnorm(p, 0., 1., /*lower_tail*/TRUE, /*log_p*/FALSE);
     y = R_forceint(mu + sigma * (z + gamma * (z*z - 1) / 6));
+    y = fmax2(0.0, y);

     z = pnbinom(y, size, prob, /*lower_tail*/TRUE, /*log_p*/FALSE);

I used the https://github.com/r-devel/r-svn repo and its continuous
integration tools to check that it doesn't break any existing tests:
https://github.com/r-devel/r-svn/actions/runs/201327042

I have also requested a Bugzilla-account, but haven't heard anything back yet.

Best,
Constantin

Am Fr., 7. Aug. 2020 um 21:41 Uhr schrieb Ben Bolker <bbolker using gmail.com>:
>
>     I can reproduce this on
>
> R Under development (unstable) (2020-07-24 r78910)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Pop!_OS 18.04 LTS
>
>    In my opinion this is worth reporting, but discussing it here first
> was a good idea.  Many more people read this list than watch the bug
> tracker, so it will get more attention here; once the excitement has
> died down here (which might be almost immediately!), if no-one has
> already volunteered to post it to the bug tracker, request an account
> (as specified at https://www.r-project.org/bugs.html )
>
>    Thanks!
>
>     Ben Bolker
>
>
> For what it's worth it doesn't seem to be a threshold effect: approximately
>
> log10(time[seconds]) ~ -8 - log10(-size)
>
> over the range from 1e-6 to 1e-9
>
>
> ff <- function(x) {
>     system.time(qnbinom(0.5, mu=3, size=10^x))[["elapsed"]]
> }
> svec <- seq(-5,-9,by=-0.2)
> res <- lapply(svec, function(x) {
>      cat(x,"\n")
>      replicate(10,ff(x))
>      })
>
> dd <- data.frame(size=rep(svec,each=10),
>                   time=unlist(res))
> boxplot(log10(time)~size, dd)
> summary(lm(log10(time)~size, data=dd, subset=time>0))
>
>
>
>
> On 8/7/20 2:01 PM, Constantin Ahlmann-Eltze via R-devel wrote:
>
> > Hi all,
> >
> > I recently noticed that `qnbinom()` can take a long time to calculate
> > a result if the `size` argument is very small.
> > For example
> >     qnbinom(0.5, mu = 3, size = 1e-10)
> > takes ~30 seconds on my computer.
> >
> > I used gdb to step through the qnbinom.c implementation and noticed
> > that in line 106
> > (https://github.com/wch/r-source/blob/f8d4d7d48051860cc695b99db9be9cf439aee743/src/nmath/qnbinom.c#L106)
> > `y` becomes a very large negative number. Later in the function `y` is
> > (as far as I can see) only used as input for `pnbinom()` which is why
> > I would assume that it should be a non-negative integer.
> >
> > I was wondering if this behavior could be considered a bug and should
> > be reported on the bugzilla? I read the instructions at
> > https://www.r-project.org/bugs.html and wasn't quite sure, so I
> > decided to ask here first :)
> >
> > Best,
> > Constantin
> >
> >
> >
> >
> > PS: I tested the code with R 4.0.0 on macOS and the latest unstable
> > version using docker (https://github.com/wch/r-debug). The session
> > info is
> >> sessionInfo()
> > R Under development (unstable) (2020-08-06 r78973)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Ubuntu 20.04 LTS
> >
> > Matrix products: default
> > BLAS:   /usr/local/RD/lib/R/lib/libRblas.so
> > LAPACK: /usr/local/RD/lib/R/lib/libRlapack.so
> >
> > locale:
> >   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> >   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
> >   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.1.0
> >
> > ______________________________________________
> > R-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list