[R] 2.2e-16 a magic number? ks.test help

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Apr 19 08:12:10 CEST 2008


It says 'less than 2.2e-16'.  The print() method does not report smaller 
values, because they may not be computed as accurately as they appear.
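
A minimal sketch of where the figure comes from, assuming (as I believe is the case) that the print method formats the p-value with format.pval(), whose default cutoff is the double-precision machine epsilon.  The data vector is made up for illustration.

```r
# Machine epsilon for doubles: the default cutoff in format.pval()
eps <- .Machine$double.eps
print(eps)

# Values below the cutoff are shown as "less than" it, not in full
shown <- format.pval(c(0.03, 1e-20, 1e-300))
print(shown)

# The unformatted number is still stored in the test object
x <- c(101.2, 98.7, 103.4, 99.9, 100.8, 97.5, 102.1, 100.3)  # made-up data
res <- ks.test(x, "pnorm")   # compares x with N(0, 1)
print(res$p.value)
```

So the stored p-value can be inspected directly, but digits below the cutoff should not be read as meaningful.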

Note that the tests do different things, and you have not shown us how you 
used them.  The Shapiro-Wilk test tests normality (any normal distribution, 
with unspecified mean and variance).  The Kolmogorov test is for a sample 
from a completely specified distribution -- have you perchance tested for an
N(0, 1) distribution by  ks.test(x, "pnorm")?  Note the comment in the 
help:

      If a single-sample test is used, the parameters specified in '...'
      must be pre-specified and not estimated from the data. There is
      some more refined distribution theory for the KS test with
      estimated parameters (see Durbin, 1973), but that is not
      implemented in 'ks.test'.
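
To illustrate the difference, here is a sketch on simulated data (the seed, sample size, mean and sd are arbitrary choices of mine, not taken from your data).  The three calls test three different null hypotheses.

```r
set.seed(42)                          # arbitrary seed, for reproducibility
x <- rnorm(95, mean = 50, sd = 10)    # simulated stand-in for real data

# H0: x is a sample from N(0, 1) exactly.  Rejected overwhelmingly,
# because x is centred near 50, not because it is non-normal.
ks_std <- ks.test(x, "pnorm")

# H0: x is a sample from N(50, 10^2), parameters fixed in advance
# (known here only because the data were simulated).
ks_known <- ks.test(x, "pnorm", mean = 50, sd = 10)

# H0: x is a sample from some normal distribution, parameters unspecified.
sw <- shapiro.test(x)

print(c(ks_std = ks_std$p.value,
        ks_known = ks_known$p.value,
        shapiro = sw$p.value))
```

Passing mean(x) and sd(x) to ks.test() instead would be estimating the parameters from the data, which is exactly what the help text warns against.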


On Fri, 18 Apr 2008, Ashton, Gail wrote:

> Hello,
> I'm trying to test my data for normality.
> I enter the data (95ish species counts)
> run >ks.test (data,pnorm)
> and get a p- value <2.2e-16
> But this seems to be the p-value no matter what the data I enter. (I
> have multiple datasets and am testing them all for normality).
> [Actually, I just entered a vector of 1's and the p-value changed.]
> When I use the >Shapiro.test command, I get different p-values
> (different from each other as well as from the ks.test results), and in
> fact one variable that was significantly different from normal using the
> ks.test (p<2.2e-16), is not significantly different from normal using
> the Shapiro.test (p=0.2389).
> I am happy to believe that all my data are not normally distributed, but
> am just suspicious when the p-value remains the same regardless of the
> data (including log and sqrt transformed data).
> Is 2.2e-16 simply the lowest the p-values can go + my data are just
> really not normal?

Well, if they are counts they are from a discrete distribution, so there 
is a sure-fire test here -- the probability of integer observations from a 
normal distribution is 0 in real numbers, and very small in computer 
arithmetic unless the variance is extremely small.

I'm not suggesting that test, but trying to point out the futility of just 
testing normality -- the question needs to be 'is the normal distribution 
an adequate approximation for the analysis I have in mind'.
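
For concreteness, the 'sure-fire test' above amounts to no more than the following check, sketched here on invented counts.  It is shown only to make the point that rejecting exact normality is trivial for such data; it says nothing about whether a normal approximation is adequate.

```r
counts <- c(0, 3, 1, 7, 2, 2, 5, 0, 1, 4, 12, 1)   # invented counts

# TRUE when every observation is a whole number, which has
# probability 0 under any normal distribution
all_whole <- all(counts == round(counts))

# TRUE when some value occurs more than once; ties also have
# probability 0 under a continuous distribution
has_ties <- anyDuplicated(counts) > 0

print(c(all_whole = all_whole, has_ties = has_ties))
```

As far as I recall, ks.test() assumes a continuous distribution and warns when the data contain ties, which is another sign that it is not designed for counts.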

> Any help much appreciated.
> Thanks,
> Gail

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595