[Rd] proposal for package that creates the isomarginal family of two-way contingency tables

Frank Technow Frank.Technow at uni-hohenheim.de
Wed May 4 10:49:28 CEST 2011


Hello,

I developed R code to create the isomarginal family of two-way 
contingency tables.
The isomarginal family is the set of all possible tables with given fixed margins. It 
is needed for computing exact test statistics for two-way contingency 
tables in frequentist statistics. I need it for computing a 
normalization constant in a Bayesian model.

I think my code can be useful for others as well, so I would like to 
publish it as a package, but only if it is not implemented already. I know 
that there is the function "r2dtable" in "stats", but it creates the 
tables randomly, so that some members of the family appear many times 
and others not at all (unless n is extremely large). Then there is the 
function "permatfull" in package "vegan", but according to its 
documentation it apparently just uses "r2dtable".
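
As a small illustration of the random sampling (the margins and the seed 
are arbitrary choices, used only for this example), the following draws 
1000 tables for a 2x2 case whose family has just three members, so almost 
all draws are repeats:

```r
set.seed(1)                               # arbitrary seed, only for reproducibility
tabs <- r2dtable(1000, c(2, 2), c(2, 2))  # 1000 random 2x2 tables with all margins equal to 2
length(tabs)                              # number of draws
length(unique(tabs))                      # number of distinct tables among them (at most 3)
```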


Currently my code covers only the case where the number of rows is fixed 
to two (with an arbitrary number of columns). This makes the implementation 
a lot easier, but is of course a limitation.
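
To show why two rows is the easy case: the first row determines the 
second, so the family is the set of integer vectors x with 
0 <= x[j] <= c[j] and sum(x) = r[1]. The following is only a minimal 
brute-force sketch of that idea, not the algorithm cited below and not 
the code I propose to publish; the name "enum_2xk" and its arguments are 
made up for this illustration.

```r
# Enumerate all 2 x k tables with row margins row_m (length 2) and
# column margins col_m. Illustrative recursion only; enum_2xk is a
# hypothetical name, not an existing function.
enum_2xk <- function(row_m, col_m) {
  stopifnot(length(row_m) == 2, sum(row_m) == sum(col_m))
  k <- length(col_m)
  out <- list()
  fill <- function(j, left, x) {
    if (j == k) {
      # the last cell of row 1 is forced; keep the table only if it fits
      if (left <= col_m[k]) {
        top <- c(x, left)
        out[[length(out) + 1]] <<- rbind(top, col_m - top, deparse.level = 0)
      }
      return(invisible(NULL))
    }
    # the remaining columns must be able to absorb what is left of row 1,
    # so the lower bound never exceeds the upper bound
    rest <- sum(col_m[(j + 1):k])
    for (v in max(0, left - rest):min(col_m[j], left)) {
      fill(j + 1, left - v, c(x, v))
    }
  }
  fill(1, row_m[1], numeric(0))
  out
}

fam <- enum_2xk(c(2, 2), c(2, 2))
length(fam)   # 3 tables: the top-left cell can be 0, 1 or 2
```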

However, many two-way table problems have a structure with multiple 
categories in one margin but a binary variable in the other. Examples are 
survival (yes or no), sex (female or male), or, in my field of plant 
genomics, where most of the data come from homozygous lines, genotype 
(AA or BB).

To state my question: Is the code worth publishing?


My code is based on the algorithm in:

Greselin (2003)
Counting and enumerating frequency tables with given margins
Statistica & Applicazioni


I would also be interested in suggestions on how to test it.

For now I used:

unique(r2dtable(n,r,c))

as a reference, but creating the whole family in this way requires a 
huge "n", even for tables of modest dimensions and counts. For example, 
the family of a 2x5 table with "r = c(8,100)" and "c = c(1,2,2,98,5)" 
("r" and "c" being the row and column margins) has 103 members 
(according to my code). Trying to find it with:

unique(r2dtable(n,r,c))

uses more than 16 GB of memory!
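
One check that needs no enumeration at all is the size of the family. 
For two rows it equals the coefficient of t^r[1] in the product over 
columns of (1 + t + ... + t^c[j]), which can be computed by repeated 
convolution. This is only a sketch under that assumption, and 
"count_2xk" is a made-up name; it verifies the number of tables, not 
their contents.

```r
# Count the 2 x k tables with row margins row_m and column margins col_m
# without listing them. count_2xk is a hypothetical helper name.
count_2xk <- function(row_m, col_m) {
  n1 <- row_m[1]
  ways <- c(1, rep(0, n1))         # ways[s + 1] = number of ways to reach partial sum s
  for (cj in col_m) {
    new <- numeric(n1 + 1)
    for (v in 0:min(cj, n1)) {     # v = count placed in row 1 of this column
      idx <- seq_len(n1 + 1 - v)   # partial sums s with s + v <= n1
      new[idx + v] <- new[idx + v] + ways[idx]
    }
    ways <- new
  }
  ways[n1 + 1]
}

count_2xk(c(8, 100), c(1, 2, 2, 98, 5))   # 103, matching the family size reported above
```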

Thanks in advance,

Frank


-- 
Frank Technow
University of Hohenheim
350 Institute of Plant Breeding, Seed Sciences, and Population Genetics
70593 Stuttgart/Germany
Phone: 0049 711 459 23544
e-mail: Frank.Technow at uni-hohenheim.de or Frank.Technow at gmx.net