[Rd] update on dnbinom with large "size"

Fri Jul 4 11:56:41 CEST 2008

On Fri, 4 Jul 2008, Martin Maechler wrote:

>>>>>> "BDR" == Prof Brian Ripley <ripley at stats.ox.ac.uk>
>>>>>>     on Fri, 4 Jul 2008 09:02:21 +0100 (BST) writes:
>
>    BDR> On Thu, 3 Jul 2008, Ben Bolker wrote:
>    >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>    >>
>    >>
>    >> ~ turns out I don't need to look at the C code.
>    >>
>    >> ~ if one uses the mu/size parameterization of the
>    >> negative binomial, R computes size/(size+mu) to switch
>    >> parameterizations.  If size>>mu this gets rounded to 1
>    >> ... should be easy enough to test and return NA under
>    >> these circumstances?
>
>    BDR> It is all vectorized, so not so easy.  But why is NA
>    BDR> appropriate, when a Poisson approximation seems more
>    BDR> appropriate?
>
> yes, definitely more appropriate.
>
>    BDR> The same issue affects [pqr]nbinom, of course.
>
> indeed.
>
>    BDR> The short-term advice is of course 'don't do that'
> :-)
>
>    BDR> but use a suitable approximation such as Poisson.  Maybe
>    BDR> there are reasons to support values of size > 1e10, but
>    BDR> I suspect only for completeness and maybe it is better
>    BDR> to at least give a warning.
>
> As you know I'm very much in favor of "completeness" here (and
> have already spent an unreasonable/unpayable amount of hours to
> achieve that in quite a few cases for R in the past).
>
> I'd much prefer a solution without a warning that used the
> appropriate approximation when needed.

The argument for warning is that one possible reason for doing this is to 
explore the difference between negative binomial and Poisson, and so it 
would be helpful to be warned that a Poisson had been used (as we warn in 
some other cases where accuracy is lost, if not entirely consistently).

> I'm now between a workshop and a week of vacation, so I am not
> promising anything, but I'd plan to get this fixed before the
> end of the summer.
>
>    >> - --------------------------
>    >> ~ Check it out:
>    >>
>    >> curve(dnbinom(1,mu=0.5,size=x),log="x",from=1,to=1e18)
>    >> abline(h=dpois(1,lambda=0.5),col=2,lty=2)
>    >> text(1,dpois(1,lambda=0.5)+0.02,"Poisson",col=2,pos=4)
>
> a very nice example, thank you, Ben!
>
> Regards, Martin
>
>    >>
>    >> ~ I will take a look in the C code when I get a chance to
>    >> see if I can offer a patch, but in the meantime wanted to
>    >> alert people to this "feature" ...  (it looks like I will
>    >> have to go through dnbinom and dbinom_raw to see where
>    >> the problem is ...)
>    >>
>    >> ~ Ben Bolker
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595