[BioC] Wilcoxon test [was: logged data or not logged prior to using normalize.quantile]

A.J. Rossini blindglobe at gmail.com
Thu Apr 14 22:29:59 CEST 2005


One readable citation in this area is:

    Lumley T, Diehr P, Emerson S, Chen L.
    The importance of the normality assumption in large public health data sets.
    Annu Rev Public Health. 2002;23:151-69. Epub 2001 Oct 25.

referenced at:

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11910059&dopt=Abstract



On 4/11/05, Matthew  Hannah <Hannah at mpimp-golm.mpg.de> wrote:
> > Not forgetting that the two-sample t-test performs fine under the same
> > circumstances (large balanced samples), even for non-normal
> > distributions and unequal variances.
> >
> >Regards
> >Gordon
> 
> Does anyone by any chance have a few references for this point,
> particularly for non-normal distributions? I've seen references to
> Monte Carlo simulation studies that look at assumption violations, but
> being at a biological institute it's difficult to get access to good
> statistics texts. All internet searches just mention 'large' and
> 'balanced' samples. I would be especially interested in 'what if'
> situations like the ones you gave for the Wilcoxon test.
> 
> I have group sizes between 0-30, generally unbalanced to some degree
> (mean min/max = 15/25). I know these are not that large (if large at
> all). But I'm looking to 'quantify' what problems I may get comparing
> sample sizes of, say, 6, 15, 21, 25, 29, with non-normal distributions,
> skew and outliers also to take into account in some cases.
> 
> I'm wondering, if I have unbalanced group sizes (x larger than y),
> whether it would reduce the problems of unequal variances to subsample
> x1 <- sample(x, length(y))
> then test (x1, y) for a number (10?) of repeats and then take the
> maximum p-value.
> I guess anything with n < 10 would have to be discarded first.
> 
> Looking at the data case by case is not possible with >500 compounds and
> ~20 groups.
> 
> Cheers for any info,
> Matt
> 
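
For what it's worth, the subsampling scheme proposed in the quoted message can be sketched in R roughly as follows. This is only a minimal sketch under stated assumptions: the function name subsample_max_p and its arguments n_rep and test are made up for illustration, and the simulated x and y stand in for two groups of one compound. Note that t.test() already defaults to the Welch (unequal variance) form, so the full-data call at the end is the natural baseline to compare against.

```r
# Hypothetical helper (name and arguments are illustrative, not from any package):
# subsample the larger group down to the size of the smaller one, run a
# two-sample test, repeat n_rep times, and keep the largest p-value.
subsample_max_p <- function(x, y, n_rep = 10, test = t.test) {
  # Make sure x holds the larger group
  if (length(x) < length(y)) {
    tmp <- x
    x <- y
    y <- tmp
  }
  p_values <- replicate(n_rep, {
    x1 <- sample(x, length(y))  # draw length(y) values without replacement
    test(x1, y)$p.value
  })
  max(p_values)
}

# Simulated example data (an assumption, for illustration only)
set.seed(1)
x <- rnorm(25, mean = 0, sd = 2)  # larger group, larger variance
y <- rnorm(15, mean = 1, sd = 1)  # smaller group

subsample_max_p(x, y)                      # subsampled Welch t-test
subsample_max_p(x, y, test = wilcox.test)  # subsampled Wilcoxon rank-sum test
t.test(x, y)$p.value                       # Welch t-test on all of the data
```

Taking the maximum over repeats is conservative, and each repeat throws away part of the larger group, so this is likely to cost power relative to running the Welch test on the full data.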


-- 
best,
-tony

"Commit early, commit often, and commit in a repository from which we can easily
roll-back your mistakes" (AJR, 4Jan05).

A.J. Rossini
blindglobe at gmail.com


