[BioC] Wilcoxon rank sum test on Golub dataset

Björn Usadel usadel at mpimp-golm.mpg.de
Thu Apr 20 12:56:58 CEST 2006


Dear Andrej,

Maybe I misunderstand you, since I can't see your data structure.
But I assume you want to extract
a) 10 genes which are higher in AML than ALL, ranked by Wilcoxon significance
b) 10 genes which are higher in ALL than AML, ranked by Wilcoxon significance

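But the wilcox.test call that you are using just gives the most significantly 
changed genes, regardless of direction.
By default, wilcox.test tests whether the two samples differ in either direction 
(a two-sided test), so swapping the arguments gives the same p-value.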

Example:
x<-c(1,2,3,4,5)
y<-c(6,7,8,9)
wilcox.test(x,y)
W = 0, p-value = 0.01587
wilcox.test(y,x)
W = 20, p-value = 0.01587

You can either rank by fold change or test a different (one-sided) 
hypothesis,
e.g.
wilcox.test(x,y,alternative="greater") #Test if x is higher than y
W = 0, p-value = 1
wilcox.test(y,x,alternative="greater") #Test if y is higher than x
W = 20, p-value = 0.007937
Please note that this way the p-value for the direction that is actually higher 
is half of the two-sided one above, but since you are only interested in 
ranking that shouldn't matter.
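
For the per-gene loop, a minimal sketch could look like the following. Since I can't see your data, aml.mat and all.mat are simulated stand-ins (genes in rows, samples in columns), not the real Golub data, and the helper one.sided.p is a made-up name for illustration only.

```r
# Simulated stand-in data (NOT the real Golub data): genes in rows,
# samples in columns. Group sizes and gene count are arbitrary.
set.seed(1)
n.genes <- 50
aml.mat <- matrix(rnorm(n.genes * 11), nrow = n.genes)  # hypothetical AML samples
all.mat <- matrix(rnorm(n.genes * 27), nrow = n.genes)  # hypothetical ALL samples
aml.mat[1:5, ]  <- aml.mat[1:5, ]  + 4   # genes 1-5 made higher in AML
all.mat[6:10, ] <- all.mat[6:10, ] + 4   # genes 6-10 made higher in ALL

# Hypothetical helper: one-sided p-value per gene, testing whether the
# values in 'a' tend to be higher than those in 'b'.
one.sided.p <- function(a, b) {
  sapply(seq_len(nrow(a)), function(i)
    wilcox.test(a[i, ], b[i, ], alternative = "greater")$p.value)
}

wilcox.AML.pvals <- one.sided.p(aml.mat, all.mat)  # small if higher in AML
wilcox.ALL.pvals <- one.sided.p(all.mat, aml.mat)  # small if higher in ALL

# Row indices of the 10 genes with the smallest p-value in each direction
top.AML <- order(wilcox.AML.pvals)[1:10]
top.ALL <- order(wilcox.ALL.pvals)[1:10]
```

With the one-sided alternative the two p-value vectors are no longer identical, so the two top-10 lists differ.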

HTH,
björn


>Dear BioC useRs,
>
>I'm working on classical Golub dataset and I would like to select 10 
>genes that are mostly overexpressed in AML, and 10 genes that are mostly 
>overexpressed in ALL by using Wilcoxon rank sums test.
>
>I try with the below code (I paste just the core of the loop) which 
>compute p value for each row, but the result is identical:
>
>wilcox.AML.pvals[i]  <- wilcox.test(aml.i,all.i)$p.value
>wilcox.ALL.pvals[i]  <- wilcox.test(all.i,aml.i)$p.value
>
>Thanks in advance for any suggestion,
>Andrej