[R] negative P-values with shapiro.test

Mark Cowley m.cowley at garvan.org.au
Wed Jul 16 07:32:30 CEST 2008


Dear list,
I am analysing a set of quantitative proteomics data from 16 patients  
which has a large amount of missing data, so some proteins are detected  
only once, and others up to a maximum of 16 times.
I want to test each protein for normality with the Shapiro-Wilk test  
(function shapiro.test in package stats), which can only be applied to  
data with at least 3 measurements, which is fine. When I have only 3  
observations and two of them are identical, shapiro.test produces  
negative P-values, which should never happen.
This occurs for all of the situations that I have tried for 3 values,  
where 2 are the same.
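
One possible explanation, offered as a sketch rather than a confirmed diagnosis: with 3 values of which 2 are identical, W works out to exactly 3/4, the smallest value it can take for n = 3. The exact null distribution for n = 3 gives p = (6/pi) * (asin(sqrt(W)) - asin(sqrt(3/4))), which is 0 at W = 3/4, so rounding in the constants could push the result slightly below zero. Whether shapiro.test in R 2.6.1 computes the p-value exactly this way is an assumption; the code below only checks the arithmetic by hand.

```r
# Hand computation of W and the exact n = 3 p-value (illustrative only;
# this does not call shapiro.test and is not its source code).
x <- sort(c(1, 1, 2))

# Shapiro-Wilk coefficients for n = 3 are (-1/sqrt(2), 0, 1/sqrt(2))
a <- c(-1, 0, 1) / sqrt(2)

# W = (sum of a_i * x_(i))^2 / (sum of squared deviations from the mean)
W <- sum(a * x)^2 / sum((x - mean(x))^2)
W

# Exact p-value for n = 3; at W = 3/4 this is zero up to rounding error
p_exact <- (6 / pi) * (asin(sqrt(W)) - asin(sqrt(0.75)))
p_exact
```

Any p-value on the order of 1e-07 in magnitude here, positive or negative, should probably be read as zero.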

Reproducible code below:
# these are the data points that raised the problem
 > shapiro.test(c(-0.644, 0.0566, 0.0566))

	Shapiro-Wilk normality test

data:  c(-0.644, 0.0566, 0.0566)
W = 0.75, p-value < 2.2e-16

 > shapiro.test(c(-0.644, 0.0566, 0.0566))$p.value
[1] -7.69e-07
# note that the printed output reports a small p-value that looks positive,
# but when you extract it using $p.value, it is negative
# various other tests
 > shapiro.test(c(1,1,2))$p.value
[1] -8.35e-07
 > shapiro.test(c(-1,-1,2))$p.value
[1] -1.03e-06
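
Until the cause is understood, one way to keep the per-protein analysis going is to clamp the returned p-value into [0, 1]. The helper below is a minimal sketch; `safe_shapiro_p` is a hypothetical name, not a function in package stats, and treating a slightly negative p-value as 0 is an assumption about what the correct answer should be.

```r
# Hypothetical helper (not part of package stats): run shapiro.test on the
# measurements for one protein and clamp the p-value into [0, 1].
safe_shapiro_p <- function(x) {
  x <- x[!is.na(x)]                 # drop missing measurements
  # shapiro.test needs at least 3 values; skip constant data as well,
  # since W is undefined when all values are identical
  if (length(x) < 3 || diff(range(x)) == 0) return(NA_real_)
  p <- shapiro.test(x)$p.value
  min(max(p, 0), 1)                 # assumption: tiny negatives mean 0
}

safe_shapiro_p(c(-0.644, 0.0566, 0.0566))
safe_shapiro_p(c(1, 1, 2))
safe_shapiro_p(c(0.3, NA, 0.3))    # NA: fewer than 3 observed values
```

With data stored as a proteins-by-patients matrix, this could be applied row by row with `apply(m, 1, safe_shapiro_p)`, where `m` stands for that matrix.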

cheers,

Mark

 > sessionInfo()
R version 2.6.1 (2007-11-26)
i386-apple-darwin8.10.1

locale:
en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] tcltk     graphics  grDevices datasets  utils     stats     methods   base

other attached packages:
 [1] qvalue_1.12.0    Cairo_1.3-5      RSvgDevice_0.6.3 SparseM_0.74     pwbc_0.1
 [6] mjcdev_0.1       tigrmev_0.1      slfa_0.1         sage_0.1         qtlreaper_0.1
[11] pajek_0.1        mjcstats_0.1     mjcspot_0.1      mjcgraphics_0.1  mjcaffy_0.1
[16] haselst_0.1      geomi_0.1        geo_0.1          genomics_0.1     cor_0.1
[21] bootstrap_0.1    blat_0.1         bitops_1.0-4     mjcbase_0.1      gdata_2.3.1
[26] gtools_2.4.0

-----------------------------------------------------
Mark Cowley, BSc (Bioinformatics)(Hons)

Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research, Sydney, Australia


