[R] Kolmogorov-Smirnov test

m.marcinmichal m.marcinmichal at gmail.com
Wed Apr 27 23:22:43 CEST 2011


Hi,
I have a problem with a Kolmogorov-Smirnov goodness-of-fit test. I am trying
to fit a distribution to my data, and I ran two tests:
- # First Kolmogorov-Smirnov Tests fit
- # Second Kolmogorov-Smirnov Tests fit
(see below). The two tests return different results and I don't know which
one is correct. The first test returns a low D = 0.0234 and a low
p-value = 0.00304. The low D suggests that the empirical and theoretical
distribution functions nearly coincide, but the low p-value indicates that I
can reject the null hypothesis H0. On the other hand, this p-value is much
higher than the p-value from the second test (< 2.2e-16). Which of the two
tests gives the correct result?
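
A small D and a small p-value are not necessarily in conflict when the samples are large. The sketch below evaluates the standard asymptotic (Kolmogorov limiting distribution) approximation to the two-sided two-sample p-value. The function name `ks_asymptotic_p` is hypothetical, and `m = 11999` is an assumption, since the exact length of vFrequ is not given in the post; this is an illustration, not necessarily the exact computation ks.test() performs.

```r
# Asymptotic two-sided p-value for a two-sample KS statistic D
# with sample sizes n and m (Kolmogorov limiting distribution).
ks_asymptotic_p <- function(D, n, m, terms = 100) {
  lambda <- D * sqrt(n * m / (n + m))
  k <- seq_len(terms)
  p <- 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * lambda^2))
  min(max(p, 0), 1)  # clamp against floating-point overshoot
}

n <- 11999  # length of vectorSentence, as stated in the post
m <- 11999  # assumption: vFrequ has roughly the same length

p_large <- ks_asymptotic_p(0.0234, n, m)      # same D, large samples
p_small <- ks_asymptotic_p(0.0234, 100, 100)  # same D, small samples
print(c(large = p_large, small = p_small))
```

With the same D, the p-value falls below 0.01 for samples of about 12000 observations but is close to 1 for samples of 100, so the p-value cannot be read without the sample size.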

matr = rbind(c(1,2))
layout(matr) 

# length vectorSentence = 11999
vectorSentence <- c(....)
vectorLength <- length(vectorSentence)

# assume that we have a table(vectorSentence)
#  1    2    3    4    5    6    7    8    9 
# 512 1878 2400 2572 1875 1206  721  520  315 

# Poisson parameter (fitdistr() comes from the MASS package)
library(MASS)
param <- fitdistr(vectorSentence, "poisson")

# Expected density
density.exp <- dpois(1:9, lambda=param[[1]][1])

# Expected frequencies
frequ.exp <- dpois(1:9, lambda=param[[1]][1])*vectorLength

# Construct a numeric vector of data values (y = vFrequ for the
# Kolmogorov-Smirnov test); fractional expected counts are truncated
vFrequ <- rep(seq_along(frequ.exp), times = floor(frequ.exp))

# Check the transformation: plot(density.exp, ylim=c(0,0.20)) should match
# plot(table(vFrequ)/vectorLength, ylim=c(0,0.20))
plot(table(vectorSentence)/vectorLength)
plot(density.exp, ylim=c(0,0.20))
par(new=TRUE)
plot(table(vFrequ)/vectorLength, ylim=c(0,0.20))

# First Kolmogorov-Smirnov Tests fit
ks.test(vectorSentence, vFrequ)

# Second Kolmogorov-Smirnov Tests fit
ks.test(vectorSentence, "dpois", lambda=param[[1]][1])
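
The documentation of ks.test() describes its second argument as either a numeric vector of data values or a character string naming a cumulative distribution function. "dpois" is the Poisson probability mass function, not its CDF ("ppois"), which may explain a D close to 1 in the second test. A minimal sketch with simulated data follows; `x` and `lambda_hat` are hypothetical stand-ins for vectorSentence and the fitted parameter, since the real data are not shown.

```r
set.seed(1)
x <- rpois(1000, lambda = 4)  # hypothetical stand-in for vectorSentence
lambda_hat <- mean(x)         # maximum-likelihood estimate of the Poisson mean

# Passing the mass function: the "CDF" values never approach 1
res_pmf <- suppressWarnings(ks.test(x, "dpois", lambda = lambda_hat))

# Passing the cumulative distribution function
res_cdf <- suppressWarnings(ks.test(x, "ppois", lambda = lambda_hat))

d_pmf <- unname(res_pmf$statistic)
d_cdf <- unname(res_cdf$statistic)
print(c(dpois = d_pmf, ppois = d_cdf))
```

Even with "ppois" the p-value stays approximate: the test assumes a continuous distribution, and discrete data always produce ties, which is what the warning refers to.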

# Output of the first Kolmogorov-Smirnov test

Two-sample Kolmogorov-Smirnov test

data:  vectorSentence and vFrequ 
D = 0.0234, p-value = 0.00304
alternative hypothesis: two-sided 

Warning message:
In ks.test(vectorSentence, vFrequ) :
  cannot compute correct p-values with ties


# Output of the second Kolmogorov-Smirnov test

One-sample Kolmogorov-Smirnov test

data:  vectorSentence 
D = 0.9832, p-value < 2.2e-16
alternative hypothesis: two-sided 

Warning message:
In ks.test(vectorSentence, "dpois", lambda = param[[1]][1]) :
  cannot compute correct p-values with ties



Best

Marcin M.



