[R] Randomly sampling subsets of dataframe variable

David Winsemius dwinsemius at comcast.net
Fri Mar 12 21:16:22 CET 2010


On Mar 12, 2010, at 3:06 PM, Hosack, Michael wrote:

> Fellow R users,
>
> I am stumped on what would seem to be something fairly simple.
> I have a dataframe that has a variable named 'WEEK' that takes
> the numbers 1:26 (26 week time-period) with each number repeated
> five times consecutively (once for each weekday, Monday through
> Friday). Ex. 111112222233333.....2626262626. I would like to
> randomly extract two weekdays per five day week for each of
> 26 weeks and store this data as a separate dataframe. I have
> been unable to get the sample function to work properly.
> I have also tried using the runif function to assign random
> numbers to each row of my dataframe, sort the dataframe first
> by week number then by random number value, and finally select
> the first two elements from each week subset (26 weeks total,
> giving 52 randomly selected values).  I can't figure out how
> to select the first two elements. My goal is to randomly
> select two weekdays per week (without replacement) for each of
> 26 consecutive weeks. Any advice would be greatly appreciated.

 > replicate(26,sample(1:5, 2))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17]
[1,]    4    1    3    2    3    1    3    5    1     1     2     4     2     5     1     1     5
[2,]    1    3    4    1    2    3    4    3    3     2     4     5     1     2     3     5     1
     [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
[1,]     2     4     5     4     5     3     3     4     4
[2,]     4     2     2     1     2     1     1     1     2
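
Each column of that matrix holds the two weekday positions drawn for one week. A minimal sketch of how those positions might be turned into row indices follows. It assumes a hypothetical data frame `dat` (the name and its DAY column are not from the original post) with 130 rows ordered by WEEK and five consecutive rows per week, as described in the question.

```r
set.seed(1)  # only so the sketch is reproducible

# Hypothetical data frame: 26 weeks x 5 weekdays, ordered by WEEK
dat <- data.frame(WEEK = rep(1:26, each = 5),
                  DAY  = rep(c("Mon", "Tue", "Wed", "Thu", "Fri"), times = 26))

# 2 x 26 matrix; column k holds the two weekday positions drawn for week k
picks <- replicate(26, sample(1:5, 2))

# as.vector() reads the matrix column by column, so each week's row offset
# (0, 5, 10, ...) is repeated twice to line up with that week's two picks
rows <- sort(as.vector(picks) + rep((0:25) * 5, each = 2))

sub <- dat[rows, ]  # 52 rows, two per week
```

This relies on the rows being in week order with exactly five rows per week; if that does not hold for the real data, the offsets would need to be computed differently.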

 > replicate(26,sample(1:5, 2))[,1]
[1] 1 4
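
For the sort-based approach described in the question (random number per row, sort within week, keep the first two), the missing step of selecting the first two rows per week could be sketched with ave(). This is one possible way to do it, not necessarily what was intended above; `dat`, `rnd`, and `sub2` are hypothetical names chosen for illustration.

```r
set.seed(2)  # only so the sketch is reproducible

# Hypothetical data frame: 26 weeks x 5 weekdays
dat <- data.frame(WEEK = rep(1:26, each = 5),
                  DAY  = rep(c("Mon", "Tue", "Wed", "Thu", "Fri"), times = 26))

dat$rnd <- runif(nrow(dat))               # one random number per row
dat <- dat[order(dat$WEEK, dat$rnd), ]    # sort by week, then by random number

# ave() applies seq_along within each WEEK, giving positions 1..5 per week
pos  <- ave(dat$rnd, dat$WEEK, FUN = seq_along)
sub2 <- dat[pos <= 2, ]                   # first two rows of each week
```

Because the random numbers are ordered within each week, keeping the first two rows amounts to sampling two weekdays per week without replacement.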

--
David Winsemius, MD
West Hartford, CT


