[R] Generating unordered, with replacement, samples

Thu Sep 18 01:24:45 CEST 2014

Thank you!

That does exactly what I was looking for.

Best,
Giovanni

________________________________________
From: Duncan Murdoch [murdoch.duncan at gmail.com]
Sent: Wednesday, September 17, 2014 15:02
To: Giovanni Petris; r-help at R-project.org
Subject: Re: [R] Generating unordered, with replacement, samples

On 17/09/2014 3:46 PM, Giovanni Petris wrote:
> Hi Duncan,
>
> You are right. The idea of the derivation is to 'throw' k placeholders ("*" in the example below) into the list of individuals in the population. For example, if the population is letters[1:6] and the sample size is 4, the following code generates a 'sample' uniformly at random.
>
> > n <- 6; k <- 4
> > set.seed(2)
> > xxx <- rep("*", n + k)
> > ind <- sort(sample(2 : (n+k), k))
> > xxx[setdiff(1 : (n+k), ind)] <- letters[seq.int(n)]
> > noquote(xxx)
>   [1] a b * c d * * e f *
>
> This represents the sample (b, d, d, f). I am still missing the "all I need to do" step that you mention, that is, how to transform the vector xxx into something more readily usable, like c("b", "d", "d", "f"), or even a summary of counts. I guess I am looking for a bit of R trickery here...

I think this works, but you'd better check!

Sample the placeholders:

ind <- sort( sample(n + k - 1, n - 1) )  # sort() is needed: diff() below assumes increasing positions

Add placeholders at the start and end:

ind <- c(0, ind, n+k)

Take the diffs, and subtract one:

diff(ind) - 1

I think this gives the counts you want.
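
For what it's worth, the steps above can be wrapped into a small function. This is only a sketch: sample_counts is a made-up name for illustration, not a base R function, and it assumes n >= 1 and k >= 1. The last line uses rep() to expand the counts into the sample itself, which is one way to get the kind of vector asked about.

```r
# Hypothetical helper (not from the thread): draw one unordered sample of
# size k, with replacement, from a population of size n, returned as counts.
sample_counts <- function(n, k) {
  # Choose n - 1 divider positions among n + k - 1 slots; sort() is required
  # because diff() below assumes increasing positions.
  bars <- sort(sample(n + k - 1, n - 1))
  # Sentinels at 0 and n + k close off the first and last gaps.
  diff(c(0, bars, n + k)) - 1
}

n <- 6; k <- 4
set.seed(2)
counts <- sample_counts(n, k)
names(counts) <- letters[seq_len(n)]
counts                                    # one count per individual, summing to k
rep(letters[seq_len(n)], times = counts)  # the same sample as a vector of length k
```

Each of the choose(n + k - 1, k) possible count vectors corresponds to exactly one choice of divider positions, so they should all be equally likely.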

Duncan Murdoch