[R] Randomly select elements based on criteria

Petr Savicky savicky at cs.cas.cz
Thu Mar 22 22:27:47 CET 2012


On Thu, Mar 22, 2012 at 11:42:53AM -0700, aly wrote:
> Hi,
> 
> I want to randomly pick 2 fish born the same day but I need those
> individuals to be from different families. My table includes 1787 fish
> distributed in 948 families. An example of a subset of fish born in one
> specific day would look like:
> 
> >fish
> 
> fam   born  spawn
> 25	46	43
> 25	46	56
> 26	46	50
> 43	46	43
> 131	46	43
> 133	46	64
> 136	46	43
> 136	46	42
> 136	46	50
> 136	46	85
> 137	46	64
> 142	46	85
> 144	46	56
> 144	46	64
> 144	46	78
> 144	46	85
> 145	46	64
> 146	46	64
> 147	46	64
> 148	46	78
> 149	46	43
> 149	46	98
> 149	46	85
> 150	46	64
> 150	46	78
> 150	46	85
> 151	46	43
> 152	46	78
> 153	46	43
> 156	46	43
> 157	46	91
> 158	46	42
> 
> Where "fam" is the family that fish belongs to, "born" is the day it was
> born (in this case day 46), and "spawn" is the day it was spawned. I want to
> know if there is a correlation in the day of spawn between fish born the
> same day but that are unrelated (not from the same family). 
> I want to randomly select two rows but they have to be from different fam.
> The first part (random selection) I got by doing:
> 
> > ran <- sample(nrow (fish), size=2); ran
> 
> [1]  9 12
> 
> > newfish <- fish [ran,];  newfish
> 
>     fam born spawn
> 103 136   46    50 
> 106 142   46    85 
> 
> In this example I got two individuals from different families (good) but I
> will repeat the process many times and there's a chance that I get two fish
> from the same family (bad):
> 
> > ran<-sample (nrow(fish), size=2);ran
> 
> [1] 26 25
> 
> > newfish <-fish [ran,]; newfish
> 
>     fam born spawn
> 127 150   46    85
> 126 150   46    78
> 
> I need a conditional but I have no clue on how to include it in the code.

Hi.

Try the following.

  repeat {
    # draw two distinct row indices
    ran <- sample(nrow(fish), size=2)
    # accept the pair only if the two fish are from different families
    if (fish$fam[ran[1]] != fish$fam[ran[2]]) break
  }
  fish[ran, ]

This generates only pairs of fish from different families. Note,
however, that the loop will run forever if the data contain fish
from only one family.
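
One way to guard against the endless loop is to check the number of
distinct families before sampling. The following is only a sketch: the
function name sample_unrelated_pair and its fam_col argument are
hypothetical, the small data frame is just the first six rows of the
table from the question, and it assumes that the family column has no
missing values.

```r
# Hypothetical helper: rejection sampling with a guard against the
# case where no valid pair exists.
sample_unrelated_pair <- function(fish, fam_col = "fam") {
  fam <- fish[[fam_col]]
  # With fewer than two distinct families the loop below could never stop.
  if (length(unique(fam)) < 2) {
    stop("need fish from at least two different families")
  }
  repeat {
    ran <- sample(nrow(fish), size = 2)   # two distinct row indices
    if (fam[ran[1]] != fam[ran[2]]) break # accept only unrelated fish
  }
  fish[ran, ]
}

# First six rows of the example table from the question
fish <- data.frame(fam   = c(25, 25, 26, 43, 131, 133),
                   born  = 46,
                   spawn = c(43, 56, 50, 43, 43, 64))

pair <- sample_unrelated_pair(fish)

# Repeating the draw: each column holds the spawn days of one pair
spawn_pairs <- replicate(100, sample_unrelated_pair(fish)$spawn)
```

The check costs one pass over the family column and turns the silent
endless loop into an error message. If a few families contain almost
all of the fish, the loop may still need many attempts before it
accepts a pair.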

Hope this helps.

Petr Savicky.


