[R] Looking for a test of standard normality

Mon Nov 12 08:01:49 CET 2012

This is making less and less sense to me as time goes on.

(1) The ranks of the small sample within the combined sample
have *integer* values and will not be distributed (under the null
hypothesis or otherwise) according to a *continuous* uniform
distribution.  Hence applying qnorm() makes no sense.

(2) Why attempt to transform to normality anyway?  Just deal directly
with those ranks, which I guess would have (under the null hypothesis)
a discrete uniform distribution on {1, 2, ..., m+n} where m and n are
the sizes of the two samples.

(3) Using ranks as you do sounds to me like re-inventing some form
of non-parametric test of equality of distributions.

(4) I doubt that you will get much, if any, more power from such
rank-based tests than you would from the KS test.

(5) If your test, whatever it is, lacks the power to detect the fact
that the two samples are from different distributions, then almost
surely any analysis that you do which is based on the assumption
that the two distributions are the same will be as "correct" as it
can possibly be.  If the data do not contain information which distinguishes
the two distributions, then you might as well analyse the data as if
there is only one distribution.  If the information content ain't there,
it ain't there.

(6) What *practical* knowledge about a real phenomenon would be
revealed if a test rejected the hypothesis that the distributions underlying
the two samples were equal?
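
For what it is worth, points (1), (3) and (4) can be sketched in R
roughly as follows.  This is only a minimal illustration on simulated
data, not an analysis of the actual samples: the object names (small,
large, r) and the sample sizes are made up here, and the Wilcoxon
rank-sum test is just one existing rank-based test picked as an example.

```r
# Illustrative sketch only: the data are simulated and all object names
# (small, large, r) are hypothetical, not taken from the thread.
set.seed(42)
small <- rnorm(20)    # the small sample
large <- rnorm(500)   # the much larger sample

# Point (1): ranks of the small sample within the combined sample.
# With no ties these are integers in 1, ..., length(small) + length(large),
# so they cannot follow a continuous uniform distribution.
r <- rank(c(small, large))[seq_along(small)]
print(sort(r))

# Point (4): the two-sample Kolmogorov-Smirnov test on the raw data.
ks <- ks.test(small, large)

# Point (3): an existing rank-based two-sample test (Wilcoxon rank-sum,
# also known as Mann-Whitney); it is sensitive mainly to location shifts.
wx <- wilcox.test(small, large)

print(c(ks = ks$p.value, wilcoxon = wx$p.value))
```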

     cheers,

         Rolf Turner

On 12/11/12 15:12, Herschtal Alan wrote:
> Thanks for your response. The background is that I am trying to test
> whether a small sample and a much larger sample actually came from the
> same distribution. I could just perform a KS test on the 2 samples, but
> as I said, ideally I'd like a test that is more powerful than that. So I
> look at the percentile ranks of the small sample within the large
> sample, which should be uniformly distributed if the 2 samples are from
> the same population, and then transform using "qnorm". The result should
> be standard normal. Perhaps the next best alternative is to do a
> chi-square test on the percentiles, checking for equal numbers in each
> decile bin. This would certainly work, and the only disadvantage that I
> can see is that the selection of the bin boundaries is somewhat
> arbitrary.
>
> Alan Herschtal
> Senior Biostatistician
> Peter MacCallum Cancer Centre
>
> -----Original Message-----
> From: Rolf Turner [mailto:rolf.turner at xtra.co.nz]
> Sent: Friday, 9 November 2012 2:17 PM
> To: Herschtal Alan
> Cc: r-help at r-project.org
> Subject: Re: [R] Looking for a test of standard normality
>
>
> Others may correct me, but I cannot imagine any test of standard
> normality giving appreciably more power than is given by the
> Kolmogorov-Smirnov test.
>
> I also wonder about the point of testing for (standard) normality in
> the first place.  There is a quote --- I think it refers to testing
> for heteroscedasticity, but I believe it applies equally to testing
> for normality --- about such testing being analogous to going out of
> the harbour in a rowing dinghy to see if it's safe for an ocean liner
> to put to sea.
>
>       cheers,
>
>           Rolf Turner
>
> On 09/11/12 13:23, Herschtal Alan wrote:
>> Dear list members,
>>
>> I am looking for a goodness-of-fit test that will tell me if a sample is
>> likely to have come from a standard normal distribution. I can find
>> plenty of omnibus tests for normality in the nortest package, but none
>> of them appear to allow me to test against the specific alternative that
>> the data are not standard normal. My backup option is to use a
>> Kolmogorov-Smirnov test, but my impression is that that is not a very
>> powerful test. Any suggestions?