# [Rd] dhyper, phyper (PR#10853)

tdye at hawaii.edu tdye at hawaii.edu
Wed Feb 27 09:05:07 CET 2008

```Aloha all,

I know too little about what I'm about to write and hope I'm not

For a class I'm teaching in archaeological data analysis, I'm trying
to put together a routine that calculates the so-called Petersen
index and, especially, confidence intervals for the index.  This was
introduced to archaeologists by N.R.J. Fieller and A. Turner in an
article in Journal of Archaeological Science (1982) called Number
Estimation in Vertebrate Samples.  They say that "calculation of
precise confidence intervals for population sizes is, in principle at
least, straightforward.  It involves calculation of cumulative
hypergeometric probabilities (i.e. the summation of probabilities
given by equation 3.1 of Seber, 1973)."  The reference is to G.A.F.
Seber's book, The Estimation of Animal Abundance.

I went to equation 3.1 and wrote a small function to sum its
probabilities, modeled after phyper() and taking the arguments in the
same order (the names have changed to suit the archaeological
situation):

> seber <- function(p,l,n,r)
>   {
>     y <- 0
>     for (x in 0:p)
>       y <- y + exp(lchoose(l,x) + lchoose(n-l,r-x) - lchoose(n,r))
>     y
>   }

When used in the larger routine, this yields results that very
closely approximate the results in Fieller and Turner's table 1.

I initially thought I could use the function phyper() for this
because, as I interpret the help files, this routine yields
cumulative hypergeometric probabilities.  But I'm finding that it
gives different results than seber().

I apologize if I am in too far over my head, but I am wondering if
this is a bug in dhyper/phyper?  Perhaps I have misunderstood what
phyper() actually does, or am calling it incorrectly?  Or, were
Fieller and Turner in error?

All the best,
Tom

Thomas Dye
Dean Hall 201, Tuesday 1:00-1:55 pm

```