[R] k nearest neighbours

Wed Apr 7 07:21:40 CEST 2004

- You could loop over the points in A using sapply, so you never store the
huge n*m matrix.
- R accepts logical indices, so you can drop the which() calls.
- R already has a rank function, so you don't need to define your own. You
probably want its ties.method="first" or ties.method="random" argument.
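The three suggestions above could be combined roughly as follows. This is only a sketch: the names gc_dist and knn_within are made up for illustration, and gc_dist is a plain haversine stand-in (with an assumed Earth radius of 3963.34 miles) so that the example runs without the fields package. It returns an n*k matrix of row indices into B, padded with NA where fewer than k points fall inside the radius.

```r
# Hypothetical helper: great-circle (haversine) distance from one point p
# to every row of B. Columns are (lon, lat) in degrees. The Earth radius
# in miles is an assumption, chosen to roughly match rdist.earth's units.
gc_dist <- function(p, B, R = 3963.34) {
  rad  <- pi / 180
  dlon <- (B[, 1] - p[1]) * rad
  dlat <- (B[, 2] - p[2]) * rad
  h <- sin(dlat / 2)^2 +
       cos(p[2] * rad) * cos(B[, 2] * rad) * sin(dlon / 2)^2
  2 * R * asin(pmin(1, sqrt(h)))
}

# Hypothetical wrapper: for each row of A, the row indices in B of the
# (at most) k nearest points lying within `radius`, nearest first.
knn_within <- function(A, B, radius, k) {
  one <- function(i) {
    d <- gc_dist(A[i, ], B)      # one row of distances at a time
    d[d > radius] <- NA          # logical index, no which() needed
    r <- rank(d, na.last = "keep", ties.method = "first")
    match(seq_len(k), r)         # index holding rank 1, 2, ..., k (or NA)
  }
  # sapply returns k x n (or a plain vector when k = 1); reshape to n x k
  matrix(sapply(seq_len(nrow(A)), one), ncol = k, byrow = TRUE)
}
```

With the data from the question this would be called as knn_within(A, B, radius = 180, k = 5). To keep using fields, the call to gc_dist inside the loop could be replaced by rdist.earth(A[i, , drop = FALSE], B), which likewise computes only one row of distances per iteration.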

Angel Lopez <angel_lul <at> hotmail.com> writes:

: 
: I want to
: 1) Select, for each of the n points in a matrix A, those of the m points
: in B that lie within a given radius.
: 2) Of those points within the radius, select the k nearest ones.
: 
: What I now do is
: 1) Create an n*m matrix C where I put the distances from all the points
: in B to the points in A and make NA those cells where the distance is
: larger than the radius. (The points are geographical locations so I use 
: function rdist.earth in package fields) e.g.:
: library(fields)
: data(ozone)
: A<-cbind(ozone$lon.lat[1:10,])
: B<-cbind(ozone$lon.lat+2)
: C<-rdist.earth(A,B)
: radius<-180 # The search radius
: C[which(C>radius)]<-NA
: 
: 2) Then I make NA everything but the k nearest ones
: k<-5 # The nearest neighbours
: rank<-function(x){sort.list(sort.list(x))};
: C[which(apply(C,2,rank)>k)]<-NA;
: 
: My problem is that the code is quite slow and, due to the need to create
: an n*m matrix, I often run out of memory. I would also prefer to get
: a C matrix that is n*k instead of n*m, where each of the values in C
: indicates the row in B where the corresponding k-nearest point is.
: But I cannot find a way to solve my main problem, which is the need to
: create an n*m matrix.
: Thanks for any clues,
: Angel