[R] uniform integer RNG 0 to t inclusive

Sean O'Riordain seanpor at acm.org
Tue Sep 19 10:03:32 CEST 2006


Hi Duncan,

Thanks for that.  In the light of what you've suggested, I'm now using
the following:

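  # generate a random integer M from 0 to t (inclusive)
  if (t < 10000000) { # to avoid memory problems...
    M <- sample(t + 1, 1) - 1  # sample(n, 1) draws from 1:n, so shift down by 1
  } else {
    M <- t + 1                 # start out of range so the loop runs at least once
    while (M > t) {            # reject the rare draw that rounds up to t + 1
      M <- as.integer(runif(1, min=0, max=t+1-.Machine$double.eps))
    }
  }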

Cheers and thanks,
Sean
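
A vectorized variant of the sample() branch above might look like the
following. This is only a sketch: the function name rand_int_inclusive is
hypothetical (it does not appear in the thread), and it assumes t + 1 fits in
R's integer range so that sample.int() can be used.

```r
# Hypothetical helper (name not from the thread): draw n integers
# uniformly from 0..t inclusive.
rand_int_inclusive <- function(n, t) {
  # Assumption: t is a single non-negative whole number below integer.max
  stopifnot(length(t) == 1, t >= 0, t < .Machine$integer.max)
  # sample.int(t + 1, ...) draws from 1..(t + 1); subtracting 1 shifts
  # the range to 0..t. replace = TRUE makes the draws independent.
  sample.int(t + 1, size = n, replace = TRUE) - 1L
}

set.seed(1)
x <- rand_int_inclusive(1000, 3)   # many draws from a small range
y <- rand_int_inclusive(5, 5e8)    # a few draws from a large range
```

Using replace = TRUE matters once n > 1, as noted in the quoted reply below;
otherwise the draws would be sampled without replacement.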

On 18/09/06, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> On 9/18/2006 3:37 AM, Sean O'Riordain wrote:
> > Good morning,
> >
> > I'm trying to concisely generate a single integer from 0 to n
> > inclusive, where n might be of the order of hundreds of millions.
> > This will however be used many times during the general procedure, so
> > it must be "reasonably efficient" in both memory and time... (at some
> > later stage in the development I hope to go vectorized)
> >
> > The examples I've found through searching RSiteSearch() relating to
> > generating random integers say to use: sample(0:n, 1)
> > However, when n is "large" this first generates a large sequence 0:n
> > before taking a sample of one... this computer doesn't have the memory
> > for that!
>
> You don't need to give the whole vector:  just give n, and you'll get
> draws from 1:n.  The man page is clear on this.
>
> So what you want is sample(n+1, 1) - 1.  (Use "replace=TRUE" if you want
> a sample bigger than 1, or you'll get sampling without replacement.)
> >
> > When I look at the documentation for runif(n, min, max) it states that
> > the generated numbers will be min <= x <= max.  Note the "<= max"...
>
> Actually it says that's the range for the uniform density.  It's silent
> on the range of the output.  But it's good defensive programming to
> assume that it's possible to get the endpoints.
>
> >
> > How do I generate an x such that the probability of being (the
> > integer) max is the same as any other integer from min (an integer) to
> > max-1 (an integer) inclusive... My attempt is:
> >
> > urand.int <- function(n,t) {
> >   as.integer(runif(n,min=0, max=t+1-.Machine$double.eps))
> > }
> > # where I've included the parameter n to help testing...
>
> Because of rounding error, t+1-.Machine$double.eps might be exactly
> equal to t+1.  I'd suggest using a rejection method if you need to use
> this approach:  but sample() is better in the cases where as.integer()
> will work.
>
> Duncan Murdoch
> >
> > Is floor() "better" than as.integer()?
> >
> > Is this correct?  Is the probability of the integer t the same as the
> > integer 1 or 0 etc... I have done some rudimentary testing and this
> > appears to work, but power being what it is, I can't see how to
> > realistically test this hypothesis.
> >
> > Or is there a better way of doing this?
> >
> > I'm trying to implement an algorithm which samples into an array,
> > hence the need for an integer - and yes I know about sample() thanks!
> > :-)
> >
> > { incidentally, I was surprised to note that the maximum value
> > returned by summary(integer_vector) is "pretty" and appears to be
> > rounded up to a "nice round number", and is not necessarily the same
> > as max(integer_vector) where the value is large, i.e. of the order of
> > say 50 million }
> >
> > Is version etc relevant? (I'll want to be portable)
> > > version
> > platform       i386-pc-mingw32
> > arch           i386
> > os             mingw32
> > system         i386, mingw32
> > status
> > major          2
> > minor          3.1
> > year           2006
> > month          06
> > day            01
> > svn rev        38247
> > language       R
> > version.string Version 2.3.1 (2006-06-01)
> >
> > Many thanks in advance for your help.
> > Sean O'Riordain
> > affiliation <- NULL
> >
>
>
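
The rejection method suggested in the quoted reply, together with one rough
way to check the "is every integer equally likely" question, could be
sketched as below. The function name rand_int_reject is hypothetical, and the
chi-squared check is an assumption about how one might test uniformity on a
small range; neither is code from the thread.

```r
# Hypothetical helper (name not from the thread): rejection sampling on
# top of runif() for integers 0..t inclusive.
rand_int_reject <- function(n, t) {
  out <- floor(runif(n, min = 0, max = t + 1))
  bad <- out > t   # TRUE only if a draw came out as exactly t + 1
  while (any(bad)) {
    # redraw just the rejected positions
    out[bad] <- floor(runif(sum(bad), min = 0, max = t + 1))
    bad <- out > t
  }
  out   # whole-number doubles, so this also works beyond the integer range
}

# Rough uniformity check on a small range (t = 9), where each value
# occurs often enough for a chi-squared goodness-of-fit test.
set.seed(42)
draws  <- rand_int_reject(1e5, 9)
counts <- table(factor(draws, levels = 0:9))
res    <- chisq.test(counts)   # null hypothesis: all ten values equally likely
print(res$p.value)
```

A small p-value would suggest non-uniformity. For t in the hundreds of
millions each value is hit too rarely for this test, so one would have to bin
the draws into equal-width ranges first.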