[R] Normal tests disagree?

(Ted Harding) Ted.Harding at manchester.ac.uk
Wed Dec 2 12:53:27 CET 2009


On 01-Dec-09 23:11:20, rkevinburton at charter.net wrote:
> I have data that I feed into shapiro.test and jarque.bera.test, yet
> they seem to disagree. What do I use for a decision?
> 
> For my data set I have a p.value of 0.05496421 returned from
> shapiro.test and 0.882027 from jarque.bera.test. I have
> included the data set below.
> 
> Thank you.
> 
> Kevin
[Data snipped]

The reason is that the Jarque-Bera test (JB) works with the squared
skewness plus 1/4 of (kurtosis - 3)^2. For a Normal distribution,
the skewness is zero and the kurtosis is 3. Hence large values of JB
are evidence against the hypothesis that the distribution is Normal,
on the basis that the skewness and/or kurtosis depart from the values
to be expected for a Normal distribution.
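
In symbols (this matches the explicit implementation given further
down), with S the sample skewness, K the sample kurtosis and n the
sample size:

   JB = (n/6)*(S^2 + ((K - 3)^2)/4)

and under the Null Hypothesis JB is approximately distributed as
chi-squared with 2 degrees of freedom.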

However, it is perfectly possible to get skewness near 0, and kurtosis
near 3, for manifestly non-Normal distributions. I get Skewness=0.014
and Kurtosis=3.32 for your data, both quite close to the Normal values.
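
As a sketch of how this can happen (my own construction, nothing to
do with your data): a symmetric mixture of a uniform and a Laplace
distribution, each scaled to variance 1, with mixing weight 5/7 on
the uniform, has population skewness 0 and kurtosis
(5/7)*1.8 + (2/7)*6 = 3 exactly, yet is plainly non-Normal:

   set.seed(1)
   n <- 5000
   u <- runif(n, -sqrt(3), sqrt(3))                       # kurtosis 1.8
   l <- sample(c(-1,1), n, replace=TRUE)*rexp(n, sqrt(2)) # Laplace, kurtosis 6
   y <- ifelse(runif(n) < 5/7, u, l)
   m <- mean(y) ; v <- mean((y-m)^2)
   mean((y-m)^3)/v^(3/2)   # sample skewness: near 0
   mean((y-m)^4)/v^2       # sample kurtosis: near 3
   shapiro.test(y)         # should reject Normality decisively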

However, you only have to look at the histogram to see that the
distribution has a distinctly non-Normal appearance:

   hist(x,breaks=20)

The Shapiro-Wilk test, on the other hand, works (broadly speaking)
in terms of standardised spacings between the order statistics of
the sample, compared with what they should be for a sample from a
Normal distribution. It is therefore sensitive to features of the
sample which are rather different from the features that the J-B
test is sensitive to.

Given the appearance of the histogram, it is to be expected that
many of the spacings between the order statistics differ from what
a Normal distribution would produce.
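
An informal way to see what the S-W test is "looking at" (a visual
analogue only, not the actual S-W computation) is a Normal Q-Q plot,
which plots the order statistics against the corresponding Normal
quantiles:

   qqnorm(x) ; qqline(x)

Departures of the points from the straight line correspond to the
distorted spacings that the S-W statistic picks up.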

Much of this, and also the insensitivity of the J-B test, arises
from the clump of values at the top of the range (6 out of the 59
between 1.83 and 2.00). Leave these out and you get quite different
results.
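
A quick check on that clump (with x as in the hist() call above):

   sum(x > 1.83)        # the 6 values in question
   range(x[x > 1.83])   # they lie between 1.83 and 2.00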

The following is an explicit implementation of the J-B test, based
on the Wikipedia description, and using the chi-squared(2) approximation
for the P-value (and also returning the skewness and kurtosis):

  jarque.bera <- function(x){
    ## central moments of the sample
    m1 <- mean(x)        ; m2 <- mean((x-m1)^2)
    m3 <- mean((x-m1)^3) ; m4 <- mean((x-m1)^4)
    ## sample skewness S and kurtosis K
    n <- length(x) ; S <- m3/(m2^(3/2)) ; K <- m4/(m2^2)
    ## the J-B statistic and its chi-squared(2) P-value
    JB <- (n/6)*(S^2 + ((K-3)^2)/4)
    P <- 1-pchisq(JB,2)
    list(JB=JB,P=P,S=S,K=K)
  }

For your original data x (as explicitly extracted by Ben Bolker):

  jarque.bera(x)
  # $JB
  # [1] 0.251065
  # $P
  # [1] 0.882027  ##### (As you found yourself)
  # $S
  # [1] 0.01396711
  # $K
  # [1] 3.318352

For the data excluding the 6 values above 1.83:

  jarque.bera(x[x<=1.83])
  # $JB
  # [1] 6.047885
  # $P
  # [1] 0.04860919
  # $S
  # [1] -0.6831185
  # $K
  # [1] 3.933842

So excluding these values reveals a distinct negative skewness and
a kurtosis distinctly greater than 3. Those 6 values were primarily
responsible for almost completely cancelling out the skewness and
excess kurtosis of the remainder of the distribution, and hence
frustrating the J-B test.

Now compare with the Shapiro-Wilk test:

  shapiro.test(x)
  #         Shapiro-Wilk normality test
  # data:  x
  # W = 0.9608, p-value = 0.05496

so the S-W P-value 0.05496 for the full data is close to the
J-B P-value 0.04861 for the reduced data. Now compare with the
S-W test on the reduced data:

  shapiro.test(x[x <= 1.83])
  #         Shapiro-Wilk normality test
  # data:  x[x <= 1.83] 
  # W = 0.9595, p-value = 0.06968

The S-W P-value has increased slightly (from 0.055 to 0.070), but
the S-W test is still picking up the non-Normality in the reduced
dataset.

The summary is that the S-W test and the J-B test are looking at
different aspects of the data. The J-B test depends on only two
summary statistics (skewness and kurtosis) as indices of
non-Normality, while the S-W test is sensitive to a wider range of
features in the fine detail of the distribution of the data.
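
A quick simulation sketch of that point (my own illustration, using
the jarque.bera() function above; the exact rejection rates will
vary with the alternative and the sample size):

   set.seed(2)
   B <- 1000 ; n <- 59   # same sample size as your data
   rej <- function(gen){
     p.sw <- replicate(B, shapiro.test(gen(n))$p.value)
     p.jb <- replicate(B, jarque.bera(gen(n))$P)
     c(SW=mean(p.sw < 0.05), JB=mean(p.jb < 0.05))
   }
   rej(function(n) rt(n, df=5))   # symmetric, heavy-tailed alternative
   rej(function(n) runif(n))      # symmetric, light-tailed alternative

Which test rejects more often depends on the kind of alternative;
neither test dominates the other.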

The failure of the J-B test to detect the non-Normality in the
data is primarily due to the fact that the 6 data values at the
top end have, in effect, compensated for the marked skewness
and kurtosis in the remainder of the data.

The ultimate lesson from all this is that different tests test
for different kinds of departure from the Null Hypothesis.
See also Uwe Ligges's remarks ...

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 02-Dec-09                                       Time: 11:53:22
------------------------------ XFMail ------------------------------



