[R] random sampling with some limitive conditions?
Alan Zaslavsky
zaslavsk at hcp.med.harvard.edu
Sun Jul 8 18:40:08 CEST 2007
If I understand your problem, this might be a solution. Assign
independent random numbers for row and column and use the corresponding
ordering to assign the row and column indices. Thus row and column
assignments are independent and the row and column totals are fixed. If
rr and cc are respectively the desired row and column totals, with
sum(rr)==sum(cc), then
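n <- sum(cc)  # total number of assignments; equals sum(rr)
# randomly permuted row labels, then independently permuted column labels
row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
col.assign <- rep(seq_along(cc), cc)[order(runif(n))]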
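As a rough illustration of the idea above, the sketch below cross-tabulates the two assignment vectors and checks the margins. The totals rr and cc and the seed are made-up values for the example, not taken from the original data.

```r
set.seed(1)         # assumed seed, only so the example is reproducible
rr <- c(3, 2, 1)    # hypothetical row totals
cc <- c(2, 2, 2)    # hypothetical column totals
stopifnot(sum(rr) == sum(cc))

n <- sum(cc)
row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
col.assign <- rep(seq_along(cc), cc)[order(runif(n))]

# Cross-tabulate the paired assignments; explicit levels keep every
# row and column in the table even if one of the totals is zero.
tab <- table(factor(row.assign, levels = seq_along(rr)),
             factor(col.assign, levels = seq_along(cc)))

rowSums(tab)   # matches rr
colSums(tab)   # matches cc
```

Note that a cell of tab can be larger than 1, so this yields a count table with fixed margins and not necessarily a 0/1 presence-absence matrix.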
If you want many such sets of random assignments to be generated at once
you can use a few more rep() calls in the expressions to generate multiple
sets in the same way. (Do you actually want the assignments or just the
tables?) Of course there are many other possible solutions since you have
not fully specified the distribution you want.
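One possible way to get many sets at once is sketched below. It uses replicate() around a small helper in place of the extra rep() calls mentioned above; the helper name random.table and the totals are hypothetical. The stats package function r2dtable() is shown as a built-in alternative that also draws random two-way tables with given margins.

```r
rr <- c(3, 2, 1)    # hypothetical row totals
cc <- c(2, 2, 2)    # hypothetical column totals

# Hypothetical helper: one random table with row totals rr and column totals cc
random.table <- function(rr, cc) {
  n <- sum(cc)
  row.assign <- rep(seq_along(rr), rr)[order(runif(n))]
  col.assign <- rep(seq_along(cc), cc)[order(runif(n))]
  table(factor(row.assign, levels = seq_along(rr)),
        factor(col.assign, levels = seq_along(cc)))
}

# A list of 1000 independent random tables
tabs <- replicate(1000, random.table(rr, cc), simplify = FALSE)

# Built-in alternative: a list of 1000 integer matrices with the same margins
tabs2 <- r2dtable(1000, rr, cc)
```

Both approaches return tables of counts. If strictly binary matrices are required, a different scheme would be needed, for example one that repeatedly swaps 2x2 submatrices while preserving the margins.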
Alan Zaslavsky
Harvard U
> From: "Zhang Jian" <jzhang1982 at gmail.com>
> Subject: [R] random sampling with some limitive conditions?
> To: r-help <r-help at stat.math.ethz.ch>
>
> I want to generate thousands of random data sets by randomizing the
> presence-absence data. One important limitation is that the row and
> column sums must be fixed. For example, the data "tst" is as follows:
> site1 site2 site3 site4 site5 site6 site7 site8 1 0 0 0 1 1 0 0 0 1 1 1 0
> 1 0 1 1 0 0 0 1 0 1 0 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 0 0 1 0 1 1 1 1 1 1 0 0
> 0 0 0 0 0 0 0 0 1 0 1 0 1
>
> sum(tst[1,]) = 3, sum(tst[,1]) = 4, and so on. When I randomize the data, the
> first row sum must equal 3, and the first column sum must equal 4.
> The same rule applies to every row and column.
> How can I generate the new random data? I have no idea.
> Thanks.