[R] Fwd: Re: knn - random result although use.all=TRUE

itziar irigoien itziar.irigoien at ehu.es
Mon Nov 23 12:58:45 CET 2015


Thank you very much for your prompt response. Now I see why the results
have a random part: with use.all=TRUE, all units tied at the k-th distance
are included in the neighbourhood, but a tie in the resulting vote still
has to be broken at random.
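
To illustrate the point, here is a minimal sketch in Python. It is not the code behind knn() in R; the function name knn_predict and the toy data are hypothetical, chosen only so that every training point is tied in distance and the vote splits evenly. It assumes the rule described above: keep all neighbours tied at the k-th distance, then break a tied vote with a random draw.

```python
import random
from collections import Counter

def knn_predict(train, labels, query, k, rng):
    """Classify `query` by k-NN, keeping every neighbour tied at the k-th distance."""
    # Squared Euclidean distance from the query to each training point.
    dists = [sum((a - b) ** 2 for a, b in zip(p, query)) for p in train]
    kth = sorted(dists)[k - 1]
    # Keep all points at or within the k-th distance, so ties are included.
    votes = Counter(lab for d, lab in zip(dists, labels) if d <= kth)
    top = max(votes.values())
    winners = sorted(lab for lab, n in votes.items() if n == top)
    # If several classes share the top vote count, only a random draw can decide.
    return rng.choice(winners)

# Hypothetical toy data: four training points, all at the same distance
# from the query, two per class, so the vote is always 2 against 2.
train = [(0, 1), (0, -1), (1, 0), (-1, 0)]
labels = ["a", "a", "b", "b"]
query = (0, 0)

# Different seeds can give different classes for the same query ...
seen = {knn_predict(train, labels, query, 3, random.Random(s)) for s in range(50)}
# ... while a fixed seed gives the same class every time.
same = [knn_predict(train, labels, query, 3, random.Random(1)) for _ in range(5)]
print(seen)
print(same)
```

This mirrors what was observed in R: fixing the seed just before the call makes the result reproducible, while the tie in the vote itself remains.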

Thank you!

Itziar Irigoien
On Fri, 20 Nov 2015 16:40, David L Carlson wrote:
> Changing your definition of cl to clase let me replicate the problem. If you set a random seed just before running knn() the results are consistent so that indicates that the function is drawing a random number at some point.
>
> You should probably contact the package maintainer, but your toy data set is trivially simple. You have 40 total observations, but X1 has only 3 different values and X2 has only 2 different values so there are only 6 different combinations. The distance matrix on your training set has 435 distances, but only 5 different values! As a result there are many, many tied values so the algorithm probably uses a random method of selecting which 3 to use.
>
> -------------------------------------
> David L Carlson
> Department of Anthropology
> Texas A&M University
> College Station, TX 77840-4352


