[R] help on permutation/randomization test

Greg Snow Greg.Snow at imail.org
Wed May 25 01:45:38 CEST 2011


If the x's that don't enter at the same time can be considered independent of each other, and only those that enter at the same time are dependent, then you can still do a permutation test.  Group the observations into clusters so that values are dependent within a cluster but independent between clusters, then permute whole clusters rather than individual data points.  This preserves the dependency within each cluster.

I don't know of any existing function that will do the whole thing for you, but this type of permutation test would take only a few lines of R code.  The split function can help with separating the clusters, sample can do the permutations, and unlist or sapply can be used in calculating the statistic of interest.
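
A minimal sketch of that approach, under assumptions that are not in the original question: the data are simulated, the names (value, cluster, cl.group, stat) are made up for illustration, every cluster belongs entirely to one group, and the statistic is the difference in group means.  Shuffling the group labels across clusters moves whole clusters between groups, so the observations inside a cluster always stay together.

```r
# Hypothetical simulated data; sizes, effects and names are illustrative only.
set.seed(1)
n.clusters <- c(A = 40, B = 60)                     # number of clusters per group
cl.group   <- rep(names(n.clusters), n.clusters)    # group label of each cluster
cl.size    <- sample(5:30, length(cl.group), replace = TRUE)
cl.effect  <- rnorm(length(cl.group))               # shared effect -> within-cluster dependence
cluster    <- rep(seq_along(cl.group), cl.size)     # cluster id of each observation
value      <- cl.effect[cluster] + rnorm(length(cluster))

# One list element per cluster, in the same order as cl.group
by.cluster <- split(value, cluster)

# Statistic of interest (an assumption): difference in mean of all observations
stat <- function(labels) {
  mean(unlist(by.cluster[labels == "A"])) -
    mean(unlist(by.cluster[labels == "B"]))
}

observed <- stat(cl.group)

# Permute group labels over clusters, not over individual observations
n.perm <- 2000
perm   <- replicate(n.perm, stat(sample(cl.group)))

# Two-sided Monte Carlo p-value; the +1 counts the observed arrangement
p.value <- (sum(abs(perm) >= abs(observed)) + 1) / (n.perm + 1)
p.value
```

Because the cluster sizes differ, the number of observations in each group changes from one permutation to the next while the number of clusters per group stays fixed.  If that is a concern, one alternative is to reduce each cluster to a summary first, e.g. sapply(by.cluster, mean), and permute those summaries.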

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Wenjin Mao
Sent: Tuesday, May 24, 2011 11:22 AM
To: Meyners, Michael
Cc: r-help at r-project.org
Subject: Re: [R] help on permutation/randomization test

Thank you, Michael.

I don't think the data within a group can be treated as repeated
measurements. Let's say I have 1000 observations from group 1 and 1500
observations from group 2. Some of the 1000 objects in group 1 entered the
system at the same time and may affect each other; the same holds for the
other group. It's hard to measure the strength of the dependency.

Even if some transformation can reduce the correlation, the
R function "permtest" cannot handle such large sample sizes. Is there any
other R function I can use?

Thanks,
Wenjin

On Tue, May 24, 2011 at 1:37 AM, Meyners, Michael <meyners.m at pg.com> wrote:

> I suspect you need to give more information/background on the data (though
> this is not primarily an R-related question; you might want to try other
> resources instead). Unless I'm missing something here, I cannot think of ANY
> reasonable test: A permutation (using permtest or anything else) would
> destroy the correlation structure and hence give invalid results, and the
> assumptions of parametric tests are violated as well. Basically, you only
> have two observations, one for each group; with some goodwill you might
> consider these as repeated measurements, but still on the same subject or
> unit. Hence there is no way to distinguish a subject effect from a treatment
> effect. There is not enough data to permute or to base a statistical test
> on. So unless you can get rid of the dependency within groups (or at least
> reasonably assume observations to be independent), I'm not very
> optimistic...
> HTH, Michael
>
> > -----Original Message-----
> > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Wenjin Mao
> > Sent: Monday, May 23, 2011 20:56
> > To: r-help at r-project.org
> > Subject: [R] help on permutation/randomization test
> >
> > Hi,
> >
> > I have two groups of data of different sizes:
> >    group A: x1, x2, ..., x_n;
> >    group B: y1, y2, ..., y_m; (m is not equal to n)
> >
> > The two groups are independent but observations within each group are
> > not independent,
> >  i.e., x1, x2, ..., x_n are not independent; but x's are independent
> > from y's
> >
> > I wonder whether a randomization test is still applicable in this case. Does
> > R have any function that can do this test for large m and n? I notice
> > that "permtest" can only handle small (m+n<22) samples.
> >
> > Thank you very much,
> > Wenjin
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

