[R] (Fisher) Randomization Test for Matched Pairs: Permutation Data Setup Based on Signs

R. Michael Weylandt michael.weylandt at gmail.com
Sun Mar 11 03:17:01 CET 2012

In general, I *think* this is a hard problem (it sounds knapsack-ish)
but since you are on small enough data sets, that's probably not so
important: if I understand you right, this little function will help:

plusminus <- function(n){
    # n rows, 2^n columns; each column is one assignment of -1/+1 signs
    t(as.matrix(do.call(expand.grid, rep(list(c(-1,1)), n))))
}

If you multiply the output of this function by your data set, you will
have columns corresponding to all possible sign choices (one column per
assignment): e.g.,

plusminus(3) * c(1,2,3)
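
To make the layout concrete, here is a small sketch of what that product looks like. The names `signs`, `d` and `x` are chosen for illustration only. The data vector is recycled down each column, so every column holds one complete sign assignment.

```r
# Illustrative names (not from the original post): signs, d, x
d <- c(1, 2, 3)
signs <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), length(d)))))
x <- signs * d   # d is recycled down each column

dim(x)           # 3 rows (one per difference), 2^3 = 8 columns
x[, 1]           # first assignment: every sign negative
x[, ncol(x)]     # last assignment: every sign positive
```

Because `expand.grid` varies its first factor fastest, the sign of the first element flips from one column to the next.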

Then you can colSums() using only the positive elements:

x <- plusminus(3) * c(1,2,3)
x[x < 0] <- 0
colSums(x)
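
If the signed matrix is still needed afterwards, one possible variant (a sketch, not necessarily better) is to multiply by the logical mask `x > 0` instead of overwriting the negative entries. The name `pos_sum` is illustrative.

```r
# Build the 3 x 8 matrix of signed values directly so the snippet stands alone
x <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), 3)))) * c(1, 2, 3)

# Negative entries are multiplied by FALSE (treated as 0); x itself is unchanged
pos_sum <- colSums(x * (x > 0))
pos_sum   # 0 1 2 3 3 4 5 6
```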


To wrap this all in one function, I'd do something like this:

test.statistic <- function(v){
    # one column per sign assignment: length(v) rows, 2^length(v) columns
    m <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), length(v)))))
    x <- m * v
    x[x<0] <- 0    # keep only the positive elements
    out <- rbind(m * v, colSums(x))
    rownames(out)[length(rownames(out))] <- "Sum of Positive Elements"
    out
}

X <- test.statistic(c(-16, -4, -7, -3, -5, +1, -10))
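
If the eventual goal is the randomization p-value, a minimal sketch might look like the following. It assumes a one-sided, lower-tail test (small T counts as extreme), which is only my reading of "lowest sum" in the question. The names `d`, `abs_d`, `t_all`, `t_obs` and `p_lower` are illustrative.

```r
# Assumption: lower-tail test, i.e. p = P(T <= observed T) over all 2^n sign assignments
d     <- c(-16, -4, -7, -3, -5, 1, -10)   # observed differences from the question
abs_d <- abs(d)
n     <- length(d)

# n x 2^n matrix of -1/+1; each column is one sign assignment
signs <- t(as.matrix(do.call(expand.grid, rep(list(c(-1, 1)), n))))

# T for every assignment: sum of |Di| over the positions given a + sign
t_all <- colSums(abs_d * (signs > 0))

# Observed T: sum of the positive differences in the data
t_obs <- sum(d[d > 0])

# Proportion of assignments with T at least as small as the observed one
p_lower <- mean(t_all <= t_obs)
p_lower   # 2/128 = 0.015625
```

Only two of the 128 assignments give T <= 1 here (all signs negative, or only the |D6| = 1 term positive), which is where 2/128 comes from.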

Hopefully that helps (I'm a little fuzzy on your overall goal -- so
that second bit might be a red herring)


On Fri, Mar 9, 2012 at 12:49 AM, Ghandalf <moolag- at hotmail.com> wrote:
> Hi,
> I am currently attempting to write a small program for a randomization test
> (based on rank/combination) for matched pairs. If you will please allow me
> to introduce you to some background information regarding the test prior to
> my question at hand, or you may skip down to the bold portion for my issue.
> There are two samples; the data, as I am sure you guessed, are matched
> into pairs, and each pair's difference is denoted by Di.
> The test statistic is T = Sum(Di) (only for those Di > 0).
> The issue I am having concerns the method to use in R to set up
> the data in the proper structure. I am to consider the absolute value of
> Di, without regard to their sign. There are 2^n ways of assigning + or -
> signs to the set of absolute differences obtained, where n = the number of
> Dis. That is, we can assign + signs to all n of the |Di|, or we might assign
> + to |D1| but - signs to |D2| to |Dn|, and so forth.
> So, for example, I have *D1=-16, D2=-4, D3=-7, D4=-3, D5=-5, D6=+1, and
> D7=-10, with n=7.*
> I need to consider the 2^7 ways of assigning signs that result in the lowest
> sum of the "positive" absolute differences. To exemplify further, we have
>
> -16, -4, -7, -3, -5, -1, -10    T = 0
> -16, -4, -7, -3, -5, +1, -10    T = 1
> -16, -4, -7, +3, -5, -1, -10    T = 3
> -16, -4, -7, +3, -5, +1, -10    T = 4
> ... and so on.
> So, if you are willing to help me, I am having trouble setting up my data
> as illustrated above. /How do I create (code for) the 2^n lines of data
> required, with all the possible combinations of + and -, in order to calculate
> the sum of the positive values in each line (the test statistic, T)?/ I have tried to
> use combn(d=data set, n=7) with a data set, d, consisting of both the
> positive and negative sign of the respective value, to no avail.
> I apologize if this is lengthy; I was not sure how to ask the aforementioned
> question without incorrectly portraying my thoughts. If any clarification is
> required, then I will be more than willing to oblige with further
> explanation. I have searched for possible solutions, but alas, came up
> empty-handed.
> Thank you.
