[Rd] proposed change to 'sample'

Patrick Burns pburns at pburns.seanet.com
Sun Jun 20 12:07:53 CEST 2010


There is a weakness in the 'sample'
function that is highlighted in the
help file.  The 'x' argument can be
either the vector from which to sample,
or the maximum value of the sequence
from which to sample.

This is ambiguous when 'x' has length
one: a single number that is at least 1
is taken to be the maximum, never a
one-element vector to sample from.
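
A small illustration of the ambiguity with
the current 'sample' (the values 10 and 20
and the seed are arbitrary choices made
for this example):

```r
set.seed(1)   # arbitrary seed, only for reproducibility

# Length two: 'x' is the vector to sample from,
# so the result is a permutation of c(10, 20).
a <- sample(c(10, 20))

# Length one: 'x' is taken as a maximum,
# so the result is a permutation of 1:10, not just 10.
b <- sample(10)

length(a)   # 2
length(b)   # 10
```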

I propose adding an argument that allows
the user (programmer) to avoid that
ambiguity:

function (x, size, replace = FALSE, prob = NULL,
     max = length(x) == 1L && is.numeric(x) && x >= 1)
{
     if (max) {
         # 'x' is the maximum: sample from 1:x
         if (missing(size))
             size <- x
         .Internal(sample(x, size, replace, prob))
     }
     else {
         # 'x' is the vector to sample from, whatever its length
         if (missing(size))
             size <- length(x)
         x[.Internal(sample(length(x), size, replace, prob))]
     }
}
<environment: namespace:base>


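This just takes the condition of the first
'if' in the current definition of 'sample'
and makes it the default value of the new
'max' argument, so calls that do not supply
'max' behave as they do now.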

So in the "surprise" section of the examples
in the 'sample' help file

sample(x[x > 9])

and

sample(x[x > 9], max=FALSE)

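have different behaviours: with 'x <- 1:10'
as in the help file, the first returns a
permutation of 1:10, while the second
always returns 10.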
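
For anyone who wants to try this without
modifying base, here is a rough user-level
sketch.  The name 'sample_max' is made up
for this example, and 'sample.int' stands in
for the '.Internal' call, so it is an
approximation of the proposal and not the
proposed code itself:

```r
# Hypothetical stand-in for the proposed 'sample';
# sample.int() replaces the .Internal(sample(...)) call.
sample_max <- function(x, size, replace = FALSE, prob = NULL,
                       max = length(x) == 1L && is.numeric(x) && x >= 1)
{
    if (max) {
        # 'x' is the maximum: sample from 1:x
        if (missing(size))
            size <- x
        sample.int(x, size, replace, prob)
    }
    else {
        # 'x' is the vector to sample from
        if (missing(size))
            size <- length(x)
        x[sample.int(length(x), size, replace, prob)]
    }
}

x <- 1:10                            # as in the help file examples
sample_max(x[x > 9])                 # permutation of 1:10
sample_max(x[x > 9], max = FALSE)    # the single value 10
```

Because the default for 'max' is evaluated
lazily, it sees the 'x' that was actually
passed in, so leaving 'max' out reproduces
the current rule.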

By the way, I'm certainly not convinced that
'max' is the best name for the argument.

-- 
Patrick Burns
pburns at pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')


