[R] Randomly sampling subsets of dataframe variable

Chuck Cleland ccleland at optonline.net
Fri Mar 12 21:27:04 CET 2010


On 3/12/2010 3:06 PM, Hosack, Michael wrote:
> Fellow R users,
> 
> I am stumped on what would seem to be something fairly simple. 
> I have a dataframe that has a variable named 'WEEK' that takes 
> the numbers 1:26 (a 26-week time period), with each number repeated 
> five times consecutively (once for each weekday, Monday through 
> Friday), e.g. 1 1 1 1 1 2 2 2 2 2 ... 26 26 26 26 26. I would like to
> randomly extract two weekdays per five day week for each of 
> 26 weeks and store this data as a separate dataframe. I have
> been unable to get the sample function to work properly. 
> I have also tried using the runif function to assign random 
> numbers to each row of my dataframe, sort the dataframe first 
> by week number then by random number value, and finally select 
> the first two elements from each week subset (26 weeks total,
> giving 52 randomly selected values).  I can't figure out how
> to select the first two elements. My goal is to randomly 
> select two weekdays per week (without replacement) for each of 
> 26 consecutive weeks. Any advice would be greatly appreciated.
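
The sort-based approach described above can be finished by ranking the rows within each week and keeping ranks 1 and 2. This is only a minimal sketch: the example data frame and the names RAND, rank.in.week and sub.DF are made up for illustration, and set.seed() is there only to make the run reproducible.

```r
# Hypothetical example data matching the description: 26 weeks x 5 weekdays.
set.seed(1)  # only so the sketch is reproducible
DF <- data.frame(WEEK = rep(1:26, each = 5), DAY = rep(1:5, times = 26))

# Assign a uniform random number to every row.
DF$RAND <- runif(nrow(DF))

# Sort by week, then by the random number within each week.
DF <- DF[order(DF$WEEK, DF$RAND), ]

# Number the rows within each week (1..5) and keep the first two per week.
rank.in.week <- ave(seq_len(nrow(DF)), DF$WEEK, FUN = seq_along)
sub.DF <- DF[rank.in.week <= 2, ]
```

Because the random numbers are drawn independently for every row, the two rows kept in each week should amount to a random pair of weekdays chosen without replacement. The code that follows draws the days directly with sample() instead.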

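# Example data: 26 weeks x 5 weekdays, with a value X for each day.
DF <- data.frame(WEEK = rep(1:26, each = 5),
                 DAY  = rep(1:5, times = 26),
                 X    = runif(5 * 26))

# For each week, draw 2 of the 5 weekdays without replacement.
DF2 <- data.frame(DAY  = c(replicate(26, sample(5, 2, replace = FALSE))),
                  WEEK = rep(1:26, each = 2))

# Keep only the rows of DF whose (WEEK, DAY) pair was drawn.
new.DF <- merge(DF, DF2, all = FALSE)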
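
The merge above joins on the columns the two data frames share (WEEK and DAY), so it relies on a DAY column being present. If the real data has no such column, one alternative is to sample row indices within each week. The following is a hedged sketch of that idea, not the method above; the names rows.by.week, keep and alt.DF are hypothetical.

```r
# Hypothetical example data with the same layout as DF above.
set.seed(42)  # only so the sketch is reproducible
DF <- data.frame(WEEK = rep(1:26, each = 5),
                 DAY  = rep(1:5, times = 26),
                 X    = runif(5 * 26))

# Group the row indices of DF by week.
rows.by.week <- split(seq_len(nrow(DF)), DF$WEEK)

# Within each week, pick 2 row indices without replacement.
# Indexing with sample.int() avoids the surprise that sample(i, 2)
# treats a length-one i as the range 1:i.
keep <- unlist(lapply(rows.by.week,
                      function(i) i[sample.int(length(i), 2)]))

# Subset the original rows, restoring the original row order.
alt.DF <- DF[sort(keep), ]

# Sanity check: 52 rows in total, exactly 2 per week.
stopifnot(nrow(alt.DF) == 52, all(table(alt.DF$WEEK) == 2))
```

The same sanity check can be applied to new.DF to confirm that the merge kept two rows for each of the 26 weeks.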

> Thank you,
> 
> Mike

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


