[R] p-values < 2.2e-16 not reported

Shi, Tao shidaxia at yahoo.com
Thu May 20 08:52:07 CEST 2010


Will,

I'm wondering if you have any insights after looking at the cor.test source
code. It seems fine to me, as the p-value is calculated either by "your first
method" or by .C code.
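
One way to check this empirically is to look at the number cor.test() stores rather than the one it prints. The sketch below is only an illustration on simulated data (the seed, sample size and noise level are arbitrary assumptions); it reads the p.value component of the result and formats it with format.pval(), whose default eps argument is .Machine$double.eps.

```r
# Simulated, strongly correlated data (arbitrary choices, for illustration only).
set.seed(1)
x <- 1:100
y <- x + rnorm(100, sd = 10)

res <- cor.test(x, y)

# The stored p-value is an ordinary double and can be far below 2.2e-16.
p <- res$p.value

# format.pval() replaces anything below its eps argument with a "<" string,
# which is a display choice and not a loss of information in the object.
shown <- format.pval(p)

print(p)
print(shown)
print(p < .Machine$double.eps)
```

If the stored value is non-zero and only the formatted string shows "<", the limit is in the printing and not in the calculation.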

...Tao



----- Original Message ----
> From: Will Eagle <will.eagle at gmx.net>
> To: r-help at r-project.org
> Sent: Wed, May 19, 2010 3:31:26 PM
> Subject: Re: [R] p-values < 2.2e-16 not reported
> 
> Dear all,
>
> thanks for your feedback so far. With the help of a colleague I think I
> found the solution to my problem:

> > pt(10,100,lower=FALSE)
> [1] 4.950844e-17
>
> IS *NOT* EQUAL TO
>
> > 1-pt(10,100,lower=TRUE)
> [1] 0

> This means that R is capable of providing p-values < 2.2e-16. However, if
> the p-value is obtained by subtraction from 1, its precision is limited by
> the machine epsilon, .Machine$double.eps = 2.220446e-16: pt(10,100) is so
> close to 1 that it is stored as exactly 1, so the difference is 0. As a
> result, p-values sufficiently far below this threshold come out as zero.
> This problem also applies to other distribution functions such as pnorm().
> For your information, I would also like to quote the relevant part of the
> R manual on .Machine$double.eps:
>
> "the smallest positive floating-point number x such that 1 + x != 1. It
> equals base^ulp.digits if either base is 2 or rounding is 0; otherwise, it
> is (base^ulp.digits) / 2. Normally 2.220446e-16."
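
The effect described above can be reproduced in a few lines. This is a minimal sketch, assuming ordinary IEEE double precision; the variable names are made up for the illustration, and lower.tail is the full name of the argument abbreviated as lower in the post (R accepts the abbreviation through partial matching).

```r
# Machine epsilon: the gap between 1 and the next larger double.
eps <- .Machine$double.eps

# Upper tail computed directly: a tiny but non-zero double.
p_direct <- pt(10, 100, lower.tail = FALSE)

# The lower tail is so close to 1 that it is stored as exactly 1,
# so subtracting it from 1 cancels to 0.
p_lower <- pt(10, 100, lower.tail = TRUE)
p_subtracted <- 1 - p_lower

# The same pattern with the normal distribution.
z_direct <- pnorm(10, lower.tail = FALSE)
z_subtracted <- 1 - pnorm(10)

print(eps)
print(p_direct)
print(p_lower == 1)
print(p_subtracted)
print(z_direct)
print(z_subtracted)
```

The small value itself is representable without difficulty; what is lost is the difference between 1 and a number that rounds to 1.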

> Although different opinions were expressed on whether it makes sense to
> differentiate p-values below the machine epsilon, in my opinion different
> effect sizes should correspond to different p-values when reporting
> statistical results. Additionally, in certain scientific fields, e.g.
> genetics, where many tests are usually performed and simple methods such as
> the Bonferroni method are used to adjust for multiple testing, it is
> important to know the exact size of the p-value.
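
To illustrate why the exact size matters under a Bonferroni adjustment, here is a small sketch. The number of tests (one million) is a hypothetical figure chosen for the example, not taken from any real study; p.adjust() and the log.p argument of pt() are standard R.

```r
# Hypothetical setting: one million tests, Bonferroni-adjusted.
n_tests <- 1e6

p_direct <- pt(10, 100, lower.tail = FALSE)  # tiny but non-zero
p_subtracted <- 1 - pt(10, 100)              # cancels to 0

# method = "bonferroni" multiplies each p-value by n and caps the result at 1.
adj_direct <- p.adjust(p_direct, method = "bonferroni", n = n_tests)
adj_subtracted <- p.adjust(p_subtracted, method = "bonferroni", n = n_tests)

# For more extreme statistics, log.p = TRUE returns log(p) instead of p,
# which keeps very small tail probabilities comparable on the log scale.
log_p <- pt(100, 100, lower.tail = FALSE, log.p = TRUE)

print(adj_direct)
print(adj_subtracted)
print(log_p)
```

The adjusted value stays informative only when the unadjusted p-value was not already rounded to zero.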

> Therefore, I would like to suggest that the 2nd variant (i.e.
> 1-pt(10,100,lower=TRUE)) should be deprecated for calculating p-values and
> that the 1st variant (i.e. pt(10,100,lower=FALSE)) should be used instead.
> Since I have seen the 2nd variant used frequently (also by very experienced
> R users) and I assume that it is hidden in many statistical test functions,
> e.g. cor.test(), this issue seems quite important to me.
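
The suggestion could look like this for a two-sided t-test p-value. Both functions are hypothetical helpers written for this illustration; neither is part of R or taken from the source of any test function.

```r
# Hypothetical helper: two-sided p-value for a t statistic, taken from the
# upper tail directly rather than as 1 minus the lower tail.
two_sided_p <- function(t_stat, df) {
  2 * pt(abs(t_stat), df, lower.tail = FALSE)
}

# The subtraction-based form, shown only for comparison.
two_sided_p_subtracted <- function(t_stat, df) {
  2 * (1 - pt(abs(t_stat), df))
}

p_good <- two_sided_p(-10, 100)
p_bad <- two_sided_p_subtracted(-10, 100)

print(p_good)
print(p_bad)
```

For moderate statistics the two forms agree to within rounding; they differ only once the lower tail gets close enough to 1 for the subtraction to lose the result.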

> Best regards,
>
> Will

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
