[Rd] Are r2dtable and C_r2dtable behaving correctly?

Martin Maechler maechler at stat.math.ethz.ch
Fri Aug 25 18:06:53 CEST 2017


>>>>> Peter Dalgaard <pdalgd at gmail.com>
>>>>>     on Fri, 25 Aug 2017 11:43:40 +0200 writes:

    >> On 25 Aug 2017, at 10:30 , Martin Maechler <maechler at stat.math.ethz.ch> wrote:
    >> 
    > [...]
    >> https://stackoverflow.com/questions/37309276/r-r2dtable-contingency-tables-are-too-concentrated
    >> 
    >> 
    >>> set.seed(1); system.time(tabs <- r2dtable(1e6, c(100, 100), c(100, 100))); A11 <- vapply(tabs, function(x) x[1, 1], numeric(1))
    >>    user  system elapsed
    >>   0.218   0.025   0.244
    >>> table(A11)
    >> 
    >>     34     35     36     37     38     39     40     41     42     43
    >>      2     17     40    129    334    883   2026   4522   8766  15786
    >>     44     45     46     47     48     49     50     51     52     53
    >>  26850  42142  59535  78851  96217 107686 112438 108237  95761  78737
    >>     54     55     56     57     58     59     60     61     62     63
    >>  59732  41474  26939  16006   8827   4633   2050    865    340    116
    >>     64     65     66     67
    >>     38     13      7      1
    >>> 
    >> 
    >> For a  2x2  table, there's really only one degree of freedom,
    >> hence the above characterizes the full distribution for that
    >> case.
    >> 
    >> I would have expected to see all possible values in  0:100
    >> instead of such a "normal like" distribution with carrier only
    >> in [34, 67].

    > Hmm, am I missing a point here?

    >> round(dhyper(0:100,100,100,100)*1e6)
    >   [1]      0      0      0      0      0      0      0      0      0      0
    >  [11]      0      0      0      0      0      0      0      0      0      0
    >  [21]      0      0      0      0      0      0      0      0      0      0
    >  [31]      0      0      0      1      4     13     43    129    355    897
    >  [41]   2087   4469   8819  16045  26927  41700  59614  78694  95943 108050
    >  [51] 112416 108050  95943  78694  59614  41700  26927  16045   8819   4469
    >  [61]   2087    897    355    129     43     13      4      1      0      0
    >  [71]      0      0      0      0      0      0      0      0      0      0
    >  [81]      0      0      0      0      0      0      0      0      0      0
    >  [91]      0      0      0      0      0      0      0      0      0      0
    > [101]      0

No, you ain't,  I was. :-(
Martin

    > -- 
    > Peter Dalgaard, Professor,
    > Center for Statistics, Copenhagen Business School
    > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
    > Phone: (+45)38153501
    > Office: A 4.23
    > Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
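
For the record, the agreement between the simulated counts and the
dhyper() values quoted in this thread can be checked directly.  What
follows is only a minimal sketch, not a formal test: the number of
replicates (1e5), the cut-offs (0.5 and 5) and the variable names are
arbitrary choices made here, and it assumes that with all margins equal
to 100 the [1,1] cell is hypergeometric with parameters
(m = 100, n = 100, k = 100), as in the dhyper() call above.

```r
## Sketch: compare simulated A[1,1] from r2dtable() with the
## hypergeometric distribution used in the dhyper() call above.
set.seed(1)
n    <- 1e5   # fewer replicates than in the thread; an arbitrary choice
tabs <- r2dtable(n, c(100, 100), c(100, 100))
A11  <- vapply(tabs, function(x) x[1, 1], numeric(1))

## Every generated table respects the fixed margins, so for a 2x2
## table the [1,1] cell determines the remaining three cells.
margins_ok <- vapply(tabs,
                     function(x) all(rowSums(x) == 100, colSums(x) == 100),
                     logical(1))

obs  <- tabulate(as.integer(A11) + 1L, nbins = 101)  # counts for 0:100
expd <- n * dhyper(0:100, 100, 100, 100)             # expected counts

## Values of A[1,1] whose expected count is at least 0.5
likely <- range((0:100)[expd >= 0.5])

## Largest standardized deviation over cells with expected count >= 5
big  <- expd >= 5
zmax <- max(abs(obs[big] - expd[big]) / sqrt(expd[big]))

print(likely)
print(zmax)
```

All values in 0:100 are possible in principle, but the hypergeometric
probabilities far from 50 are so small that such values are not
expected to show up in a simulation of this size.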


