[R] Problem with approximate null distribution (package: coin)

Ivan Adzhubey iadzhubey at rics.bwh.harvard.edu
Wed Mar 12 22:10:17 CET 2008


Hi,

I am trying to use the "approximate" option to obtain the null
distribution through Monte-Carlo resampling, as described in the coin
package documentation. Unfortunately, this does not play well with my
data: the permutation process consumes an astonishing amount of RAM
(4-5 GB) and runs far too long (30 minutes for B=10). Apparently this
is caused by the size of my dataset (see the example below), but I was
under the impression that the permutation algorithm simply draws random
contingency tables with the conditional marginals held fixed (a sketch
of what I assumed is below), in which case the memory required should
be largely independent of the dataset size and the execution time
should depend only on B. Obviously, I was wrong on both counts. Is
there a reasonable way to work around these limitations for a large
dataset? It is not that large, in fact, so I am a bit surprised the
resampling is so inefficient.
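
For illustration, here is a minimal sketch (base R only, one stratum of
the data below) of the resampling scheme I assumed coin was using:
drawing random tables with the observed margins held fixed, via
r2dtable(). The per-table cost here is driven by the table's
dimensions, not by the number of underlying observations, which is why
I expected memory and run time to be insensitive to the dataset size:

## one 4x2 stratum (Content = low) from the table shown below
tab <- matrix(c(384, 585, 621, 1466,
                597259, 888039, 896102, 1606456),
              nrow = 4,
              dimnames = list(Time = c("0", "1", "2", "3"),
                              Response = c("Yes", "No")))
set.seed(1)
## draw 10 random tables conditional on the observed margins
perm <- r2dtable(10, r = rowSums(tab), c = colSums(tab))
length(perm)   # 10 tables
## check that the row margins are indeed preserved in every draw
all(sapply(perm, function(m) all(rowSums(m) == rowSums(tab))))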

Below is the dataset; what I am trying to do is run cmh_test() on a
4x2x3 table (the exact call is shown after the table).

> adata
, , Content = low

    Response
Time     Yes        No
   0     384    597259
   1     585    888039
   2     621    896102
   3    1466   1606456

, , Content = medium

    Response
Time     Yes        No
   0     101     99525
   1     160    191698
   2     173    146814
   3     469    485012

, , Content = high

    Response
Time     Yes        No
   0     119    175938
   1     167    163881
   2      77    131063
   3     522    548924
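
For completeness, the call is essentially the following (I am assuming
coin's table method here, with the third dimension, Content, acting as
the stratification variable; B is kept at 10 only to get timings):

library(coin)
## adata is the 4x2x3 table printed above
cmh_test(as.table(adata), distribution = approximate(B = 10))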


--Ivan
