[R] Sample based on Factor Selection Criteria

Josip Dasovic jjd9 at sfu.ca
Mon Jun 1 23:00:48 CEST 2009


Dear R-users:

I'm having difficulty creating a new data frame that is a subset of an existing data frame, built by randomly sampling observations according to the values of variables within that data frame.

Here's an example of what my data frame looks like:

fact	x1	x2	x3	select...
blue	23	2.2	1.1	1
blue	28	4.2	0.8	0
blue	34	2.8	0.9	0
...
red	43	6.2	1.4	0	
red	33	5.2	1.5	1
red	35	4.2	1.6	1
...
green	22	3.5	1.1	0
green	21	4.5	1.3	0
green	33	6.5	1.7	0
green	12	4.4	1.9	0
...
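
In case it helps anyone reproduce the problem, the rows shown above could be entered as a small test data frame along these lines. The name "dat" is only a placeholder, and the elided rows (the "..." lines and any further columns) are left out rather than guessed at. The table at the end simply counts the select==1 and select==0 rows for each value of "fact".

```r
# Toy version of the data: only the rows shown above; elided rows are omitted.
# `dat` is a placeholder name, not the name of the real data frame.
dat <- data.frame(
  fact   = c("blue", "blue", "blue",
             "red", "red", "red",
             "green", "green", "green", "green"),
  x1     = c(23, 28, 34, 43, 33, 35, 22, 21, 33, 12),
  x2     = c(2.2, 4.2, 2.8, 6.2, 5.2, 4.2, 3.5, 4.5, 6.5, 4.4),
  x3     = c(1.1, 0.8, 0.9, 1.4, 1.5, 1.6, 1.1, 1.3, 1.7, 1.9),
  select = c(1, 0, 0, 0, 1, 1, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Cross-tabulate: rows are values of fact, columns are select == 0 / 1
counts <- table(dat$fact, dat$select)
counts
```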

There are hundreds of different values (i.e., "colours") of the variable "fact" within my dataset, each of which has dozens of observations (that is, there are about 50 observations with the "fact" value blue, 45 with red, 87 with magenta, etc.).

I would like to end up with a new data frame, which is a subset of my original data frame. The new (subsetted) data frame would have the following characteristics:

1) It would retain all of the observations for which "select"==1
2) It would retain a random sample of the observations for which "select"==0, such that, within each value of "fact", the number of randomly sampled "select"==0 observations equals the number of observations for which "select"==1.

Thus, in the above example, I would like to retain 
i) the first "blue" observation, and one additional randomly-selected "blue" observation for which select==0, 
ii) the 2nd and 3rd "red" observations, and two more randomly-selected "red" observations for which "select"==0, 
iii) none of the "green" observations, since none of these has a "select" value of 1.

So, the new data set would look something like this:

fact	x1	x2	x3	select
blue	23	2.2	1.1	1
blue	34	2.8	0.9	0
red	43	6.2	1.4	0	
red	33	5.2	1.5	1
red	35	4.2	1.6	1
red	28	4.4	1.4	0
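
One way this might be approached, offered only as a rough sketch: split the data frame by "fact", keep every select==1 row in each piece, draw the same number of select==0 rows at random, and bind the pieces back together. The names "dat" and "sample_matched" are hypothetical. The toy data are the rows from my example plus the extra "red" row that appears in the desired output. Capping the draw with min() when a group has too few select==0 rows is an assumption, since the criteria above do not cover that case.

```r
# Toy data: rows from the example, plus the extra "red" row from the
# desired output. `dat` is a placeholder name.
dat <- read.table(header = TRUE, stringsAsFactors = FALSE,
                  text = "fact x1 x2 x3 select
blue  23 2.2 1.1 1
blue  28 4.2 0.8 0
blue  34 2.8 0.9 0
red   43 6.2 1.4 0
red   33 5.2 1.5 1
red   35 4.2 1.6 1
red   28 4.4 1.4 0
green 22 3.5 1.1 0
green 21 4.5 1.3 0
green 33 6.5 1.7 0
green 12 4.4 1.9 0")

# Hypothetical helper: within each value of `fact`, keep all select == 1
# rows plus an equally sized random sample of select == 0 rows.
sample_matched <- function(d) {
  pieces <- lapply(split(d, d$fact), function(g) {
    ones  <- which(g$select == 1)
    zeros <- which(g$select == 0)
    # Assumption: if there are fewer zeros than ones, take all the zeros
    n <- min(length(ones), length(zeros))
    # Index through sample.int() so a single candidate row is handled
    # correctly (sample() treats a length-one number as 1:x)
    picked <- if (n > 0) zeros[sample.int(length(zeros), n)] else integer(0)
    g[sort(c(ones, picked)), , drop = FALSE]
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

set.seed(42)  # only so that the random draw can be repeated
new_dat <- sample_matched(dat)
new_dat
```

Because split() orders the groups by the value of "fact", the result comes back grouped alphabetically rather than in the original row order. Groups with no select==1 rows, such as "green" here, contribute nothing.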

Thank you for your help,
Josip


Josip Dasovic
Research Associate
Human Security Report Project
School of International Studies
Suite 7200
Simon Fraser University
515 West Hastings Street
Vancouver , BC
CANADA
V6B 5K3



