[R] New Sampling question

wangwallace talenttree at gmail.com
Thu Nov 18 01:08:07 CET 2010


I have another question about drawing samples from a data frame. It may
sound rather tricky. Let me use a data frame I posted earlier as an
example:

    SubID  CSE1  CSE2  CSE3  CSE4  WSE1  WSE2  WSE3  WSE4
        1     6     5     6     2     6     2     2     4
        2     6     4     7     2     6     6     2     3
        3     5     5     5     5     5     5     4     5
        4     5     4     3     4     4     4     5     2
        5     5     6     7     5     6     4     4     1
        6     5     4     3     6     4     3     7     3
        7     3     6     6     3     6     5     2     1
        8     3     6     6     3     6     5     4     7

This data frame has two sets of variables, and each set represents one
scale. As shown above, the first scale, CSE, consists of four items (CSE1,
CSE2, CSE3, and CSE4), and the second scale, WSE, also has four items
(WSE1, WSE2, WSE3, and WSE4). The leftmost column lists the subjects' IDs.
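
In case it helps anyone reproduce the problem, the example data can be
entered in R roughly as follows. The object name dat is only a placeholder I
chose for illustration; the values are copied from the table above.

```r
# Recreate the example data frame from the table above.
# The object name `dat` is an arbitrary placeholder.
dat <- data.frame(
  SubID = 1:8,
  CSE1  = c(6, 6, 5, 5, 5, 5, 3, 3),
  CSE2  = c(5, 4, 5, 4, 6, 4, 6, 6),
  CSE3  = c(6, 7, 5, 3, 7, 3, 6, 6),
  CSE4  = c(2, 2, 5, 4, 5, 6, 3, 3),
  WSE1  = c(6, 6, 5, 4, 6, 4, 6, 6),
  WSE2  = c(2, 6, 5, 4, 4, 3, 5, 5),
  WSE3  = c(2, 2, 4, 5, 4, 7, 2, 4),
  WSE4  = c(4, 3, 5, 2, 1, 3, 1, 7)
)

# Inspect the structure: 8 subjects, 1 ID column, 8 item columns.
str(dat)
```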

I want to create a new data frame by sampling numbers at random from the
data frame above. Below is the structure of the new data frame.

    SubID  var  var  var  var
      s     c    c    c    c
      s     c    c    c    c
      s     c    w    w    w
      s     c    w    w    w
      s     c    w    w    w
      s     c    w    w    w
      s     c    w    w    w
      s     c    w    w    w

In the new data frame:

s = SubID (ranging from 1 to 8)
var = variables
c = CSE numbers
w = WSE numbers

Some rules for constructing the new data frame:

1. The top two rows have to be filled with CSE numbers, and the numbers
within each row should be in random order. For example, if the first row
holds the numbers from subject 4, they can appear in the order 4(CSE2),
5(CSE1), 3(CSE3), and 4(CSE4). Also, the numbers in the second row do not
have to follow the item order of the first row. For example, if the first
row holds the numbers from subject 4 in the order 4(CSE2), 5(CSE1),
3(CSE3), and 4(CSE4), the numbers in the second row (assuming it is from
subject 8) do not have to be 6(CSE2), 3(CSE1), 6(CSE3), and 3(CSE4).
Numbers in these two rows should be drawn without replacement.

2. Each of the remaining rows should contain one CSE number in the leftmost
cell and three WSE numbers to its right. In each row, the three WSE numbers
must be the ones whose item number does not match that of the CSE number in
the leftmost cell. For example, if the CSE number in the leftmost cell is
4, the CSE2 number from subject 6, the three WSE numbers on the right can
only be 4(WSE1), 7(WSE3), and 3(WSE4) from subject 6.

3. The numbers in each row can only be drawn from the same subject. Also,
the subjects should be in random order. Specifically, they do not have to
be in the following order:

 SubID    
      1         
      2          
      3        
      4          
      5          
      6          
      7          
      8    
      
they can be:

 SubID    
      2         
      8          
      5        
      4          
      1          
      6          
      7          
      3
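
To make the three rules concrete, here is a rough sketch of one way they
might be implemented in base R. It is only an illustration under assumptions
of mine: each subject appears exactly once in the result, the names dat,
make_row, new_dat, and var1 to var4 are placeholders, the seed is arbitrary,
and the three WSE numbers are kept in item order because the rules do not
say whether they should be shuffled.

```r
set.seed(1)  # arbitrary seed, only so the draw can be repeated

# Example data copied from the table above (`dat` is a placeholder name).
dat <- data.frame(
  SubID = 1:8,
  CSE1  = c(6, 6, 5, 5, 5, 5, 3, 3),
  CSE2  = c(5, 4, 5, 4, 6, 4, 6, 6),
  CSE3  = c(6, 7, 5, 3, 7, 3, 6, 6),
  CSE4  = c(2, 2, 5, 4, 5, 6, 3, 3),
  WSE1  = c(6, 6, 5, 4, 6, 4, 6, 6),
  WSE2  = c(2, 6, 5, 4, 4, 3, 5, 5),
  WSE3  = c(2, 2, 4, 5, 4, 7, 2, 4),
  WSE4  = c(4, 3, 5, 2, 1, 3, 1, 7)
)

cse <- paste0("CSE", 1:4)  # column names of the CSE items
wse <- paste0("WSE", 1:4)  # column names of the WSE items

# Build one output row from the subject stored in row i of dat (rule 3:
# every value in a row comes from the same subject).
make_row <- function(i, all_cse) {
  if (all_cse) {
    # Rule 1: all four CSE items, in random order, without replacement.
    vals <- unlist(dat[i, sample(cse)])
  } else {
    # Rule 2: one random CSE item k, then the three WSE items whose
    # item number is not k (kept in item order; wrap wse[-k] in
    # sample() if they should be shuffled as well).
    k <- sample(4, 1)
    vals <- unlist(dat[i, c(cse[k], wse[-k])])
  }
  c(SubID = dat$SubID[i], setNames(vals, paste0("var", 1:4)))
}

# Rule 3: random subject order; sample() draws without replacement.
ord <- sample(nrow(dat))

# The first two output rows are CSE-only, the remaining six are mixed.
rows <- lapply(seq_along(ord), function(r) make_row(ord[r], all_cse = r <= 2))
new_dat <- as.data.frame(do.call(rbind, rows))

print(new_dat)
```

Removing the set.seed() call gives a different random arrangement on every
run.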

Any ideas?  Thanks in advance!! :)
-- 
View this message in context: http://r.789695.n4.nabble.com/New-Sampling-question-tp3047885p3047885.html
Sent from the R help mailing list archive at Nabble.com.


