[R] Allelic Differentiation, sampling, unique(), duplicated()
tlumley at u.washington.edu
Thu Sep 4 17:09:59 CEST 2003
On Fri, 5 Sep 2003, Philip Rhoades wrote:
> Hi people,
> I have made some progress trying to work out how to solve this problem
> but I have got a bit stuck - sorry if this turns out to be a simple
> exercise . .
> Allelic Differentiation (AD) in genetics measures the number of
> different alleles between (say) two populations eg:
> Organisms in Pop 1 have alleles: a, b, c, d, e
> Organisms in Pop 2 have alleles: b, b, c, d, e
> Different (unique) alleles (n) are: a
> [unique() does not do what I want here for comparing these two vectors
> and I can't get combinations of unique() and duplicated() to work
You could do it with
and there's probably a direct way to do it with match(). We should
probably have a setsymdiff() function to add to the others.
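As a rough sketch of the set-function approach: base R has union(), intersect() and setdiff(), so a symmetric difference can be built from those. The name setsymdiff() here is hypothetical (it is not an existing base R function), and the two vectors are just the example from the question.

```r
# Hypothetical helper: not part of base R, defined here for illustration only.
# Returns the elements that occur in exactly one of the two vectors.
setsymdiff <- function(x, y) union(setdiff(x, y), setdiff(y, x))

# The example populations from the question.
pop1 <- c("a", "b", "c", "d", "e")
pop2 <- c("b", "b", "c", "d", "e")

n  <- length(setsymdiff(pop1, pop2))          # alleles seen in only one population
ad <- 2 * n / (length(pop1) + length(pop2))   # AD as defined in the question
```

For these two vectors the only allele not shared is "a", so n is 1 and AD comes out as 2 * 1 / 10, matching the figure in the question.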
> Total alleles = 10
> Therefore AD = (2 * n) / 10 = 0.2
> What I want to do is compare two populations of 200 organisms each but
> sampling for only 20 at a time.
> So there are 200!/((200-20)! * 20!) possible combinations of samples in
> each population.
> For all possible combinations of sample pop1 and sample pop2 I want to
> measure AD ie (200!/((200-20)! * 20!) * 200!/((200-20)! * 20!) )
This is far too many calculations.
> As well as the unique allele problem, can someone suggest how I can do
> the sampling loops?
You can't. 10^27 is a very large number.
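To see the scale, choose() gives the number of distinct samples of 20 from 200 for one population, and the number of sample pairs is that value squared. The variable name per_pop is just for illustration.

```r
# Number of distinct samples of 20 organisms from one population of 200.
per_pop <- choose(200, 20)
per_pop      # on the order of 10^27

# Every sample from pop1 paired with every sample from pop2.
per_pop^2    # on the order of 10^54
```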
I would suggest choosing the samples from pop1 and pop2 at random, a few
thousand or a few hundred thousand times (depending on the accuracy you need).
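A minimal sketch of that random-sampling approach follows. The population data are made up (one allele label per organism, drawn from the letters), and the names ad, nsim and sims are assumptions for illustration; the AD formula is the one given in the question.

```r
# Made-up data for illustration: 200 organisms per population, one allele each.
set.seed(1)
pop1 <- sample(letters[1:8],  200, replace = TRUE)
pop2 <- sample(letters[3:10], 200, replace = TRUE)

# AD for one pair of samples: alleles found in only one sample,
# doubled and divided by the total number of alleles sampled.
ad <- function(x, y) {
  n <- length(union(setdiff(x, y), setdiff(y, x)))
  2 * n / (length(x) + length(y))
}

# Draw 20 organisms from each population (without replacement), many times.
nsim <- 5000
sims <- replicate(nsim, ad(sample(pop1, 20), sample(pop2, 20)))

mean(sims)                        # Monte Carlo estimate of the average AD
quantile(sims, c(0.025, 0.975))   # spread of AD across random sample pairs
```

Increasing nsim tightens the estimate; the standard error of the mean shrinks roughly as 1/sqrt(nsim), which is what "depending on the accuracy you need" comes down to in practice.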