[R] Kolmogorov-Smirnov tests: overflow

ripley@stats.ox.ac.uk ripley at stats.ox.ac.uk
Sun Jun 23 09:08:43 CEST 2002


Both this and your previous post suggest that your data are from a
discrete distribution (here, because they contain ties).  The standard
null distribution of the KS statistic is then inappropriate: see the first
para of `Details' in ?ks.test.
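
A quick way to see whether ties are an issue is to look at the pooled
sample.  The following is only a sketch: x and y are made-up toy vectors
standing in for genomesv and scopv, and has_ties and distinct_fraction are
illustrative names, not part of ks.test.

```r
# Hypothetical toy data; substitute the real vectors (e.g. genomesv, scopv).
x <- c(1, 2, 2, 3, 3, 3, 5)
y <- c(2, 3, 4, 4, 6)

pooled <- c(x, y)

# anyDuplicated() returns 0 when all values are distinct,
# otherwise the index of the first repeated value.
has_ties <- anyDuplicated(pooled) > 0

# Fraction of distinct values in the pooled sample; a value well below 1
# hints that the data are discrete rather than continuous.
distinct_fraction <- length(unique(pooled)) / length(pooled)

has_ties
distinct_fraction
```

If has_ties is TRUE, the p-value reported by ks.test should not be taken
at face value.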

Even if it were appropriate, your data sets are so large that you would get
statistical significance for practically insignificant differences.
But if you really want some idea of the p-value, there is a well-known
asymptotic expansion for the significance levels in terms of m and n.
My memory is that there is a monograph by Jim Durbin on this.
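
As a rough illustration only, the sketch below computes the limiting
(first-order) two-sided p-value from the Kolmogorov distribution,
P(K > x) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2), with
x = sqrt(m*n/(m+n)) * D.  This is just the leading term, not the finer
expansion in m and n referred to above, and it assumes continuous data
with no ties.  The function name ks_asymptotic_p and its terms argument
are hypothetical.  The sample sizes are converted to doubles so that the
product m * n cannot overflow the integer range.

```r
# Limiting two-sided p-value for the two-sample KS statistic D.
# Assumption: continuous distributions (no ties); hypothetical helper.
ks_asymptotic_p <- function(D, m, n, terms = 100) {
  m <- as.numeric(m)  # doubles: m * n no longer overflows as integers would
  n <- as.numeric(n)
  x <- sqrt(m * n / (m + n)) * D

  # For x below about 0.2 the tail probability equals 1 to within ~1e-12,
  # and the alternating series converges slowly there, so return 1 directly.
  if (x < 0.2) return(1)

  k <- seq_len(terms)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * x^2))
  min(1, max(0, p))   # guard against tiny rounding excursions
}

ks_asymptotic_p(0.2081, 390025, 140002)
```

For the sizes and D quoted below, x is about 67, so every term underflows
and the result is numerically zero, which mainly reflects the sample
sizes rather than a practically important difference.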

On Sun, 23 Jun 2002, Arne Mueller wrote:

> Dear All,
>
> I've got a problem with ks.test. I have two really large vectors that I'd
> like to test, but I get an overflow, and the p-value cannot be
> calculated:
>
> > length(genomesv)
> [1] 390025
> > length(scopv)
> [1] 140002
> > ks.test(genomesv, scopv)
>
>         Two-sample Kolmogorov-Smirnov test
>
> data:  genomesv and scopv
> D = 0.2081, p-value = NA
> alternative hypothesis: two.sided
>
> Warning messages:
> 1: NAs produced by integer overflow in: n.x * n.y
> 2: NAs produced by integer overflow in: n.x * n.y
> 3: cannot compute correct p-values with ties in: ks.test(genomesv,
> scopv)
>
> Is there anything I can do about this? I'd really like to know what the
> p-value is ;-)
>
> 	thanks a lot for help,
>
> 	Arne
>
> --
> Arne Mueller
> Biomolecular Modelling Laboratory
> Cancer Research UK, London Research Institute
> 44 Lincoln's Inn Fields
> London WC2A 3PX, U.K.
> phone1 : +44-(0)20-72693405      | fax  : +44-(0)20-75945789
> phone2 : +44-(0)20-75945776      | mobil: +44-(0)7984601749
> email  : a.mueller at cancer.org.uk | http://www.bmm.icnet.uk
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



