[R] strange behavior of lchoose in combinatorics problem

Richard M. Heiberger rmh at temple.edu
Sun Jun 26 01:26:17 CEST 2016


Floating point numbers are rounded to 53 significant bits.  When you
use logs of integers you are using
floating point numbers.

> .3 + .6 == .9
[1] FALSE
> (.3 + .6) - .9
[1] -1.110223e-16

See FAQ 7.31 for more discussion.

On Sat, Jun 25, 2016 at 10:13 AM, Gonçalo Ferraz <gferraz29 at gmail.com> wrote:
> Hi,
>
> I am working on interactions between animals, studying whether animal 1 is attracted to animal 2 (or vice-versa). I looked for the two animals in N=2178 sampling occasions, finding animal 1 a total of N1=165 times, and animal 2 a total of N2=331 times. In J=97 occasions, I saw both animals at the same time.
>
> The more frequently I see the two animals in the same sampling occasion, the more I will believe that they are attracted to each other. Therefore, I want to calculate the probability of finding J<=97 when the two animals are moving around independently of each other. The higher this probability, the stronger the attraction.
>
> Following Veech (Journal of Biogeography 2014, 41: 1029-1035) I compute the log probability of obtaining a number n of encounters between animals as ‘lpn’ in the function lveech:
>
> lveech <- function(N,N1,N2,n) {
>     a <- lchoose(N,n)
>     b <- lchoose(N-n,N2-n)
>     c <- lchoose(N-N2,N1-n)
>     d <- lchoose(N,N2)
>     e <- lchoose(N,N1)
>     lpn <- (a+b+c)-(d+e)
>     return(lpn)
> }
>
> I have tested this function with shorter, intuitive numbers, and it works as expected. I use log probabilities to stay away from intractable numbers.
>
> Next, I use function lveech to obtain a vector ‘lpvec’ that gives me all the log probabilities of getting n=0,1,2,…,97 simultaneous observations of the two animals:
>
> lpvec <- rep(NA,J+1)
> for(i in 1:(J+1)) lpvec[i] <- lveech(N,N1,N2,nseq[i])
>
> PROBLEM: the sum of the probabilities in lpvec should be <=1, but it is not. The sum is something on the order of 1.48e-13. That happens whether I sum exp(lpvec) or use a function for summing probabilities in log-prob space. In most applications of lveech, the sum of the probabilities is <=1, as it should be, but occasionally I get this problem. Why should this be? Does anyone know of any issues with lchoose that could explain this?
>
> I want to solve this because if I round up the sum to 1 I will not be able to quantify the attraction between animals and compare it with the attractions between other pairs of animals in my data.
>
> Thank you so much for reading the long question.
>
> G.
>
>
>
>
>
>
>
>
> I have two binary vectors, v1 and v2, both with length N=2178. The sum of vector v1 is N1=165 and the sum of vector v2 is N2=331. I need to compute the probability that
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list