[R] p-values < 2.2e-16 not reported

Will Eagle will.eagle at gmx.net
Thu May 20 00:31:26 CEST 2010


Dear all,

thanks for your feedback so far. With the help of a colleague I think I 
found the solution to my problem:

 > pt(10,100,lower=FALSE)
[1] 4.950844e-17

IS *NOT* EQUAL TO

 > 1-pt(10,100,lower=TRUE)
[1] 0

This means that R is capable of providing p-values < 2.2e-16. However, 
if the upper tail is computed as 1 minus the lower tail, the result is 
limited by the machine epsilon .Machine$double.eps = 2.220446e-16: the 
lower-tail probability is so close to 1 that it is rounded to exactly 1 
in double precision, and the subtraction then returns zero. This problem 
also applies to other distribution functions such as pnorm(). For your 
information I would also like to quote the relevant part of the R manual 
on .Machine$double.eps:
"the smallest positive floating-point number x such that 1 + x != 1. It 
equals base^ulp.digits if either base is 2 or rounding is 0; otherwise, 
it is (base^ulp.digits)/ 2.  Normally 2.220446e-16."
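
To make the rounding effect concrete, here is a minimal sketch. It assumes 
a standard IEEE 754 double-precision build of R; the variable names are 
only illustrative.

```r
eps <- .Machine$double.eps

# An increment well below eps is lost when added to 1 ...
small_increment_lost <- (1 + eps / 4) == 1
# ... whereas eps itself is, by definition, still visible.
eps_increment_kept <- (1 + eps) != 1

# The lower-tail probability is so close to 1 that it is stored as exactly 1,
# so subtracting it from 1 leaves nothing.
p_lower <- pt(10, 100, lower.tail = TRUE)
p_upper_naive <- 1 - p_lower

# Asking for the upper tail directly avoids the subtraction altogether.
p_upper_direct <- pt(10, 100, lower.tail = FALSE)

print(c(naive = p_upper_naive, direct = p_upper_direct))
```

The point is that no threshold is applied by R itself; the information is 
already gone once the lower tail has been rounded to 1.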

Although different opinions were expressed on whether it makes sense to 
differentiate p-values below the machine epsilon, in my opinion 
different effect sizes should correspond to different p-values when 
reporting statistical results. Additionally, in certain scientific 
fields, e.g. genetics, where many tests are usually performed and simple 
methods such as the Bonferroni correction are used to adjust for multiple 
testing, it is important to know the exact size of the p-value.
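
As a rough illustration of the multiple-testing point, the sketch below 
applies a Bonferroni correction with p.adjust() to p-values computed both 
ways. The t statistics, the degrees of freedom and the number of tests 
(one million) are made-up values chosen only for the example.

```r
# Hypothetical inputs: two t statistics, 100 degrees of freedom,
# and an assumed total of one million tests.
t_stats <- c(10, 12)
df <- 100
n_tests <- 1e6

p_naive  <- 1 - pt(t_stats, df)                  # both collapse to 0
p_direct <- pt(t_stats, df, lower.tail = FALSE)  # both stay positive

adj_naive  <- p.adjust(p_naive,  method = "bonferroni", n = n_tests)
adj_direct <- p.adjust(p_direct, method = "bonferroni", n = n_tests)

print(adj_naive)
print(adj_direct)
```

With the subtraction variant the two adjusted p-values cannot be told 
apart, whereas the direct variant keeps them distinct after adjustment.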

Therefore, I would like to suggest that the 2nd variant 
(i.e. 1-pt(10,100,lower=TRUE)) should be discouraged for calculating 
p-values and that the 1st variant (i.e. pt(10,100,lower=FALSE)) should be 
used instead. Since I have seen the 2nd variant used frequently 
(also by very experienced R users) and I assume that it is hidden in 
many statistical test functions, e.g. cor.test(), this issue seems 
quite important to me.
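
For anyone who wants to apply this in their own code, here is a small 
sketch of the 1st variant wrapped in a helper. The function name 
t_pvalue() is hypothetical and not part of R. The last lines use the 
log.p argument of the distribution functions, which may help when even 
the directly computed tail is too small to be stored as a double.

```r
# Hypothetical helper: two-sided p-value for a t statistic, computed
# from the upper tail directly instead of via 1 - pt(...).
t_pvalue <- function(t_stat, df) {
  2 * pt(abs(t_stat), df, lower.tail = FALSE)
}

p_two_sided <- t_pvalue(10, 100)

# For very extreme statistics the tail probability itself underflows to 0,
# but its logarithm can still be represented.
p_extreme     <- pnorm(40, lower.tail = FALSE)
log_p_extreme <- pnorm(40, lower.tail = FALSE, log.p = TRUE)

print(p_two_sided)
print(c(p = p_extreme, log_p = log_p_extreme))
```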

Best regards,

Will


