[R] Random selection of a fixed number of values by interval

Mon Dec 14 22:22:29 CET 2015

Yes.

May I suggest:

grp <- c("[0,1)", "[1,2)", "[2,3)", "[3,4)", "[4,5)")

can be obtained more simply as
grp <- levels(groups)[1:5]
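
As a quick check of that shortcut, here is a minimal sketch on made-up values (the vector "value" below is hypothetical, not the poster's data). It assumes, as in the original call, that the breaks run past 5: with include.lowest = TRUE and right = FALSE, cut() closes the last interval with "]", so the fifth label is "[4,5)" only when it is not the final level.

```r
# Hypothetical toy values; 11.79 pushes the breaks out to 0:12
value <- c(0.2, 1.5, 2.7, 3.1, 4.9, 0.8, 11.79)

groups <- cut(value, breaks = 0:ceiling(max(value)),
              right = FALSE, include.lowest = TRUE)

# Labels typed by hand versus labels taken from the factor
grp_typed  <- c("[0,1)", "[1,2)", "[2,3)", "[3,4)", "[4,5)")
grp_levels <- levels(groups)[1:5]

# TRUE when the first five levels match the hand-typed labels
same_labels <- identical(grp_typed, grp_levels)
print(same_labels)

# The final level is closed on the right because include.lowest = TRUE
last_label <- levels(groups)[nlevels(groups)]
print(last_label)
```

Taking the labels from levels(groups) avoids typos and keeps them in step with the breaks if those change.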

and one slight aesthetic change in the indexing:

from:
samples <- lapply(1:5, function(x) sample(data$id[groups==grp[x]], size[x]))

to:
samples <- lapply(1:5, function(x) sample(data[groups==grp[x],"id"],  size[x]))

(rows and columns in a data frame can be simultaneously indexed)
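
To illustrate, here is a small self-contained sketch, assuming a plain data.frame (the toy data and the "size" vector below are hypothetical, chosen only so that every interval has enough ids to sample from). It shows that both indexing forms return the same vector and then runs the sampling step.

```r
# Hypothetical toy data frame, not the poster's 70-row data
data <- data.frame(
  id    = 1:12,
  value = c(0.1, 0.5, 0.9, 1.2, 1.8, 1.4, 2.3, 2.6, 0.7, 1.1, 2.9, 6.5)
)

groups <- cut(data$value, breaks = 0:ceiling(max(data$value)),
              right = FALSE, include.lowest = TRUE)
grp  <- levels(groups)[1:3]   # "[0,1)" "[1,2)" "[2,3)"
size <- c(2, 2, 1)            # hypothetical sample sizes per interval

# The two indexing forms select the same ids for a given interval
ids_dollar <- data$id[groups == grp[1]]
ids_matrix <- data[groups == grp[1], "id"]
print(identical(ids_dollar, ids_matrix))

# Draw size[x] ids without replacement from each interval
set.seed(42)
samples <- lapply(seq_along(grp), function(x) {
  sample(data[groups == grp[x], "id"], size[x])
})
names(samples) <- grp
print(samples)
```

One caveat worth keeping in mind: if an interval contains exactly one id, sample() treats that single number n as 1:n, so such an interval would need special handling.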

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Dec 14, 2015 at 12:46 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> There are lots of ways to do this. For example,
>
>> groups <- cut(data$value, include.lowest = TRUE, right = FALSE,
> +      breaks = 0:ceiling(max(data$value)))
>> grp <- c("[0,1)", "[1,2)", "[2,3)", "[3,4)", "[4,5)")
>> size <- c(10, 7, 5, 5, 3)
>> set.seed(42)
>> samples <- lapply(1:5, function(x) sample(data$id[groups==grp[x]],
> +      size[x]))
>> names(samples) <- grp
>> samples
> $`[0,1)`
>  [1] 69 68 33 63 56 46 65 12 50 58
>
> $`[1,2)`
> [1] 20 34 43  8 15 52 19
>
> $`[2,3)`
> [1]  7 22 62 28  2
>
> $`[3,4)`
> [1] 61 53  5 25 21
>
> $`[4,5)`
> [1] 59 35 40
>
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352
>
>
>
> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Frank S.
> Sent: Monday, December 14, 2015 2:02 PM
> To: r-help at r-project.org
> Subject: [R] Random selection of a fixed number of values by interval
>
> Dear R users,
>
> I'm writing to this list because I need to draw a random sample (without replacement) from a given vector, but the catch is that I need to extract a fixed number of values from each prespecified 1-unit interval. For example, I have a data frame that looks like this (my real data frame is bigger):
>
> data <- data.frame(id = 1:70, value=  c(0.68, 2.96, 1.93, 5.63, 3.08, 3.10, 2.99, 1.79, 2.96, 0.85, 11.79, 0.06, 4.31, 0.64, 1.43, 0.88, 2.79, 4.67,
>       1.23, 1.43, 3.05, 2.44, 2.55, 3.82, 3.55, 1.56, 7.25, 2.75, 9.64, 5.14, 3.54, 3.12, 0.17, 1.07, 4.08, 4.47, 5.58, 7.41, 0.85, 4.30, 7.58,
>       0.58, 1.40, 4.74, 5.04, 0.14, 1.14, 3.28, 7.84, 0.07, 3.97, 1.02, 3.47, 0.66, 2.38, 0.06, 0.67, 0.48, 4.48, 0.12, 3.82, 2.27, 0.93, 0.30,
>       0.73, 0.33, 2.91, 0.81, 0.18, 0.42))
>
> And I would like to select, at random:
>
> 10 ids whose value belongs to the interval [0,1)
> 7 ids whose value belongs to [1,2)
> 5 ids whose value belongs to [2,3)
> 5 ids whose value belongs to [3,4)
> 3 ids whose value belongs to [4,5)
>
> # The number of values in each 1-unit interval is given by:
> table(cut(data$value, include.lowest = TRUE, right = FALSE, breaks = 0:ceiling(max(data$value))))
>
> and the size vector:
> size <- c(10, 7, 5, 5, 3)
>
> But I haven't been able to do this with the sample function. Does anyone have an idea?
>
> Thank you very much for any suggestions!!
>
> Frank S.
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.