[R] Random # generator accuracy

Thu Jul 23 21:36:07 CEST 2009

Well, one quick way (for non-generic functions) is the 'args' function:

> args(sample)
function (x, size, replace = FALSE, prob = NULL) 
NULL

A similar line appears near the top of the help page, under Usage, when you do '?sample'.  The "replace = FALSE" in the line above means that FALSE is the default (FALSE meaning no replacement).  The context of the first example and part of the Details section support that 'without replacement' is the default, but you are correct that it could be clearer (some help pages list the defaults for each of the arguments).  Some functions don't show the default, or use NULL as the default, so you need to read the argument descriptions or the Details section more closely to see what happens when that argument is left unspecified.
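
If you want the defaults as values rather than printed text, 'formals' is one option.  The following is only a small sketch of that idea; the variable name 'f' is mine, and what a NULL default means still has to come from the help page:

```r
# Sketch: look up the declared defaults of sample() from the R prompt.
f <- formals(sample)   # named list: argument name -> default expression

names(f)               # the argument names, in order
f$replace              # the declared default for 'replace'
is.null(f$prob)        # TRUE when the declared default for 'prob' is NULL

# Arguments without a default (here x and size) are stored as empty
# placeholders, so they are best read off the args() output instead.
args(sample)
```

As with 'args', this is mainly useful for non-generic functions; for a generic you would typically see little more than the dispatching arguments and '...'.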

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: Jim Bouldin [mailto:jrbouldin at ucdavis.edu]
> Sent: Thursday, July 23, 2009 12:49 PM
> To: Greg Snow; r-help at r-project.org
> Subject: RE: [R] Random # generator accuracy
> 
> 
> Thanks Greg, that most definitely was it.  So apparently the default is
> sampling without replacement.  Fine, but this brings up a question I've
> had for a bit now, which is, how do you know what the default settings
> are for the arguments of any given function?  The HTML help files don't
> seem to indicate in many (most) cases.  Thanks.
> 
> > Try adding replace=TRUE to your call to sample, then you will get
> > numbers closer to what you are expecting.
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow at imail.org
> > 801.408.8111
> >
> >
> > > -----Original Message-----
> > > > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Bouldin
> > > Sent: Thursday, July 23, 2009 12:00 PM
> > > To: r-help at r-project.org
> > > Subject: [R] Random # generator accuracy
> > >
> > >
> > > > Dan Nordlund wrote:
> > > >
> > > > "It would be necessary to see the code for your 'brief test' before
> > > > anyone could meaningfully comment on your results.  But your results
> > > > for a single test could have been a valid "random" result."
> > >
> > > > I've re-created what I did below.  The problem appears to be with the
> > > > weighting process: the unweighted sample came out much closer to the
> > > > actual than the weighted sample (>1% error) did.  Comments?
> > > Jim
> > >
> > > > x
> > >  [1]  1  2  3  4  5  6  7  8  9 10 11 12
> > > > weights
> > >  [1] 1 1 1 1 1 1 2 2 2 2 2 2
> > >
> > > > > # 1 million samples from x, of size 3, weighted by "weights";
> > > > > # the mean should be 7.50
> > > > > a = mean(replicate(1000000,(sample(x, 3, prob = weights))));a
> > > [1] 7.406977
> > > > 7.406977/7.5
> > > [1] 0.987597
> > >
> > > > > # 1 million samples from x, of size 3, not weighted this time;
> > > > > # the mean should be 6.50
> > > > > b = mean(replicate(1000000,(sample(x, 3))));b
> > > [1] 6.501477
> > > > 6.501477/6.5
> > > [1] 1.000227
> > >
> > >
> > > Jim Bouldin, PhD
> > > Research Ecologist
> > > Department of Plant Sciences, UC Davis
> > > Davis CA, 95616
> > > 530-554-1740
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> Jim Bouldin, PhD
> Research Ecologist
> Department of Plant Sciences, UC Davis
> Davis CA, 95616
> 530-554-1740