[R] knn - random result although use.all=TRUE

itziar irigoien itziar.irigoien at ehu.es
Thu Nov 19 14:09:46 CET 2015


Dear all,

I have this toy example to work with the k-NN classification approach (my 
data, code and results are at the end of the message).
Using the knn function from the class package with the parameter 
use.all=TRUE, I would not expect a random answer. Nevertheless, I get a 
different answer each time I run it. Could anyone help me find out 
what is going on?

Thanks,

Itziar Irigoien

# Generate data
library(class)   # provides knn()
n <- 40
n1 <- 16
n2 <- n-n1
# True class labels (16 cases in class 1, 24 in class 2)
clase <- rep(1:2, c(n1, n2))
set.seed(37)
X1 <- sample(1:3, n, replace=TRUE, prob=rep(1/3, 3))
set.seed(36)
aux1 <- sample(1:2, n1, replace=TRUE, prob=c(0.9, 0.1))
set.seed(38)
aux2 <- sample(1:2, n2, replace=TRUE, prob=c(0.2, 0.8))
X2 <- c(aux1, aux2)
X2 <- X2+3
X2[3] <- 5

# Select training and test sets
set.seed(36)
t <- sample(1:40, 30, replace=FALSE)
train <- cbind(X1[t], X2[t])
test <- cbind(X1[-t], X2[-t])
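# Classify the 10 test cases with 3-NN
out <- knn(train, test, clase[t], k=3, l=0, use.all=TRUE, prob=TRUE)
# Confusion table: predicted vs. true class
table(out, clase[-t])
# Proportion of the 10 test cases classified correctly
sum(diag(table(out, clase[-t])))/10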
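
A note on the data: X1 only takes values in 1:3 and X2 only values in 4:5, so there are at most six distinct points and many of the 30 training points must coincide. Equal distances are therefore very common. As a rough sketch of how one might check this, the code below counts, for each distinct point, how many training cases of each class sit on it. It uses made-up data of the same shape (x1, x2 and labels are hypothetical names, not the objects above).

```r
# Hypothetical data with the same structure as above: two discrete predictors.
set.seed(1)
x1 <- sample(1:3, 30, replace = TRUE)
x2 <- sample(4:5, 30, replace = TRUE)
labels <- sample(1:2, 30, replace = TRUE)

# One row per distinct point, one column per class.
point <- paste(x1, x2, sep = ",")
counts <- table(point, labels)
print(counts)

# Distinct points that carry both class labels.
mixed <- rownames(counts)[rowSums(counts > 0) > 1]
print(mixed)
```

Points listed in mixed have neighbours of both classes at exactly the same distance, which is the situation where the vote among neighbours is most likely to end level.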

# Results I obtained
 > out <- knn(train, test, clase[t], k=3, l=0, use.all=TRUE, prob=TRUE)
 > table(out, clase[-t])

out 1 2
  1 1 2
  2 0 7
 > sum(diag(table(out, clase[-t])))/10
[1] 0.8


 > out <- knn(train, test, clase[t], k=3, l=0, use.all=TRUE, prob=TRUE)
 > table(out, clase[-t])

out 1 2
  1 1 4
  2 0 5
 > sum(diag(table(out, clase[-t])))/10
[1] 0.6
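
For reference, the help page for knn says the classification is decided by majority vote, with ties broken at random, and that use.all controls whether all training points tied at the kth distance are included in the vote. So use.all=TRUE affects which neighbours vote, but a level vote is still resolved randomly. Whether that is what happens with the data above is an assumption, not something verified here. A minimal sketch on a made-up case (two coincident training points with different labels; all object names are hypothetical):

```r
library(class)

# Two training points at the same location but with different classes.
train_pts <- rbind(c(0, 0), c(0, 0))
train_cl  <- factor(c("a", "b"))

# 200 identical test points at that same location.
test_pts <- matrix(0, nrow = 200, ncol = 2)

# With use.all=TRUE both tied neighbours vote, giving one vote per class,
# so each prediction is decided by a random tie-break.
pred <- knn(train_pts, test_pts, train_cl, k = 1, use.all = TRUE)
print(table(pred))

# Fixing the seed immediately before each call should make runs repeatable.
set.seed(123)
first <- knn(train_pts, test_pts, train_cl, k = 1, use.all = TRUE)
set.seed(123)
second <- knn(train_pts, test_pts, train_cl, k = 1, use.all = TRUE)
print(identical(first, second))
```

If the tie-breaking is the cause, calling set.seed() right before knn() is one way to get the same table on every run, although the particular answer still depends on the seed chosen.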