[R] truncated negative binomial using rnegbin

Chris O'Brien obrienc at email.arizona.edu
Sat Feb 18 01:12:49 CET 2006


Dear R users,
I'm wanting to sample from the negative binomial distribution using the 
rnegbin function from the MASS library to create artificial samples for the 
purpose of doing some power calculations.  However, I would like to work 
with samples that come from a negative binomial distribution that includes 
only values greater than or equal to 1 (a truncated negative binomial), and 
I can't think of a straightforward way to accomplish this using rnegbin.

One suggestion I've received is that I use an iterative process to select 
numbers from the negative binomial, throw out the zeros, test the 
distribution for the desired parameters, and then repeat until the 
parameters are in an acceptable range.  I could then sample this 
'population' using the sample function, and go from there.  One major 
problem with this approach is that it is very time consuming on a desktop 
machine.

Here's a piece of code (from a friend, and untested)  that will do such a 
thing:

#
#  Function to generate a vector containing values from a truncated
#  negative binomial distribution (i.e., no zeros).  Select desired
#  mean and variance, sample size, initial values for mean and theta,
#  and a threshold value for tests.
#
#  format:  out<-nbin(desired_mean, desired_variance, n, initial_mean,
#                     theta, threshold_level)
#
#  example:  out<-nbin(2, 1, 100, 2, 2, 0.1)
#
#
nbin<-function(mu.s,var.s,n,mu.i,theta,test)
{
   library(MASS)

   mu<-0
   var<-0
   rand<-rep(0,n)
   while(abs(mu.s-mu)>=test & abs(var.s-var)>=test)
   {
     for(i in 1:n)
     {
      rnb<-0
      while(rnb==0)
       rnb<-rnegbin(1,mu.i,theta)
      rand[i]<-rnb
     }
     mu<-mean(rand)
     var<-var(rand)
   }
   return(rand)
}



As I am wanting to use these samples to compute power for a bootstrap 
procedure, the time demands will become prohibitive very quickly, 
especially for large sample sizes.
I'm thinking that there must be a more efficient, elegant, and quicker 
solution to the problem, but am having problems coming up with the answer.
I'd greatly welcome any insight into a more efficient method.

thanks in advance for any insight,
Chris O'Brien




More information about the R-help mailing list