[R] sampling random groups with all observations in the group

Greg Snow Greg.Snow at intermountainmail.org
Fri Mar 2 22:42:27 CET 2007


One possibility is to use split() to create a list with each of your
groups as an element, sample elements from that list, then rbind the
sampled elements back into a data frame.  For example:

> mydata <- data.frame(group=sample(LETTERS[1:5], 100, replace=TRUE),
+ x= 1:100, y= rnorm(100) )
> head(mydata)
  group x          y
1     B 1 -1.1709539
2     A 2  0.2438249
3     C 3 -1.9079472
4     E 4  0.6155387
5     E 5 -1.0671110
6     C 6  0.8109344
> mydata2 <- split(mydata, mydata$group)
> mysamp <- sample(5,2)
> mydata3 <- do.call('rbind',mydata2[mysamp])
> summary(mydata3)
 group        x               y          
 A: 0   Min.   : 3.00   Min.   :-1.9079  
 B: 0   1st Qu.:18.75   1st Qu.:-0.9798  
 C:17   Median :46.50   Median :-0.4309  
 D:19   Mean   :45.19   Mean   :-0.2333  
 E: 0   3rd Qu.:68.25   3rd Qu.: 0.4351  
        Max.   :97.00   Max.   : 3.0469  
> 
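The same idea also works without split(): sample the group ids
themselves and keep every row whose id was drawn.  The lines below are
only a minimal sketch of that variant, not a definitive recipe.  The
data frame mirrors the example above, while the 40% fraction, the seed
and the names ids, n_pick, picked and mysub are assumptions made up
for illustration.

```r
# Hypothetical example data; 'group' plays the role of the panel id.
set.seed(42)  # assumption: fixed seed, only so the sketch is reproducible
mydata <- data.frame(group = sample(LETTERS[1:5], 100, replace = TRUE),
                     x = 1:100, y = rnorm(100))

# Sample group ids, not rows: here 40% of the distinct groups.
ids    <- unique(mydata$group)
n_pick <- max(1, round(0.4 * length(ids)))
picked <- sample(ids, n_pick)

# Keep every observation that belongs to a sampled group.
mysub <- mydata[mydata$group %in% picked, ]

# Drop factor levels of groups that were not drawn (no-op for character ids).
mysub <- droplevels(mysub)
```

To take a fixed number of groups instead (say 200), n_pick can be set
directly.  One caveat: if ids is a single number, sample() draws from
1:ids rather than from ids itself, so numeric group ids need some care
when only one group is present.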

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
 
 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Wadud, Zia
> Sent: Friday, March 02, 2007 1:12 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] sampling random groups with all observations in the group
> 
> Hi
> I have a panel dataset with a large number of groups and a
> differing number of observations for each group. I want to
> randomly select, say, 20% of the groups or 200 groups,
> along with all observations from the selected groups (with the
> corresponding data).
> I guess it is possible to generate a random sample from the
> group ids and then match that with the entire dataset to
> get the intended dataset, but that sounds cumbersome; is
> there an easier way to do this? I checked the
> package 'sampling' and the command 'sample', but they can't do
> exactly this.
> I was wondering if someone on this list would be able to share
> his/her knowledge.
> Thanks in advance,
> Zia
> **********************************************************
> Zia Wadud
> PhD Student
> Centre for Transport Studies
> Department of Civil and Environmental Engineering
> Imperial College London
> London SW7 2AZ
> Tel +44 (0) 207 594 6055