[R] nested for() loops for returning a nearest point

Prof Brian Ripley ripley at stats.ox.ac.uk
Wed Jul 30 19:10:53 CEST 2003


For largish datasets, knn1 in package class (in the recommended VR bundle) 
is probably the quickest way to do this.  Something like

knn1(D2[, 1:2], D1[, 1:2], D2$ID)
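
A slightly fuller sketch of the same idea, under stated assumptions. knn1() takes its arguments as (train, test, cl), so the reference points (D2) go in as train, the query points (D1) as test, and cl carries the IDs of the training rows. The toy data frames below are made up for illustration; the column names long, lat and point.id follow the question. Note that knn1() measures Euclidean distance on the raw coordinates, so on longitude/latitude it only approximates great-circle nearness.

```r
library(class)  # recommended package, shipped with standard R installations

# Hypothetical toy data; column names follow the original question
D1 <- data.frame(long = c(0, 10, 20), lat = c(0, 10, 20),
                 point.id = c("a", "b", "c"), stringsAsFactors = FALSE)
D2 <- data.frame(long = c(1, 19, 11), lat = c(1, 21, 9),
                 point.id = c("x", "y", "z"), stringsAsFactors = FALSE)

# train = reference points (D2), test = query points (D1),
# cl = labels of the training rows; knn1() returns a factor
nn <- knn1(train = D2[, c("long", "lat")],
           test  = D1[, c("long", "lat")],
           cl    = factor(D2$point.id))

# Store the nearest D2 ID alongside each D1 point
D1$neighbor.id <- as.character(nn)
D1
```

The result has one entry per row of D1, which is what the question asks for in neighbor.id.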

On Wed, 30 Jul 2003, Roger Bivand wrote:

> On Wed, 30 Jul 2003, Steve Sullivan wrote:
> 
> > I'm trying to do the following:
> > 
> >  
> > 
> > For each ordered pair of a data frame (D1) containing longitudes and
> > latitudes and unique point IDs, calculate the distance to every point in
> > another data frame (D2) also containing longitudes, latitudes and point
> > IDs, and return to a new variable in D1 the point ID of the nearest
> > element of D2.
> 
> I think you can get quite a long way with the function rdist.earth() in 
> the fields package:
> 
> > loc1 <- expand.grid(long=seq(-150,150,5), lat=seq(-70,70,5))
> > dim(loc1)
> [1] 1769    2
> > loc2 <- expand.grid(long=seq(-150,150,7.5), lat=seq(-70,70,7.5))
> > dim(loc2)
> [1] 779   2
> > dists <- rdist.earth(loc1, loc2)
> > id12 <- apply(dists, 1, which.min)
> > length(id12)
> [1] 1769
> > id21 <- apply(dists, 2, which.min)
> > length(id21)
> [1] 779
> 
> using id12 and id21 to choose the point.ids if need be
> 
> > loc2$point.id[id12]
> 
> Roger
> 
> > 
> > Dramatis personae (mostly self-explanatory):
> > 
> > D1$long
> > D1$lat
> > D1$point.id
> > neighbor.id (to be created; for each ordered pair in D1, the point ID of
> > the nearest ordered pair in D2)
> > D2$long
> > D2$lat
> > D2$point.id
> > dist.geo (to be created)
> > 
> >  
> > 
> > I've been attempting this with nested for() loops that step through each
> > ordered pair in D1, and for each ordered pair [i] in D1 create a vector
> > (dist.geo) the length of D2$lat (say) that contains the distance
> > calculated from every ordered pair in D2 to the current ordered pair [i]
> > of D1, assign a value for D1$neighbor.id[i] based on
> > D2$point.id[which.min(dist.geo)], and move on to the next ordered pair
> > of D1 to create another dist.geo, assign another neighbor.id, etc.
> > 
> >  
> > 
> > There are no missings/NAs in any of the longs, lats or point.ids,
> > although advice on generalizing this to deal with them would be
> > appreciated.
> > 
> >  
> > 
> > What I've been trying:
> > 
> >  
> > 
> > neighbor.id <- vector(length=length(D1$lat))
> > dist.geo <- vector(length=length(D2$lat))
> > for(i in 1:length(neighbor.id)){
> >   for(j in 1:length(dist.geo)){
> >     # Yes, I know that isn't the right formula, this is just a test
> >     dist.geo[j] <- D1$lat[i]-D2$lat[j]
> >   }
> >   neighbor.id[i] <- D2$point.id[which.min(dist.geo)]
> > }
> > 
> >  
> > 
> > What I get is a neighbor.id of the appropriate length, but which
> > consists only of the same value repeated.  Should I instead pass the
> > which.min(dist.geo) to a variable before exiting the inner (j) loop, and
> > reference that variable in place of which.min(dist.geo) in the last
> > line?  Or is this whole approach wrongheaded?
> > 
> >  
> > 
> > This should be elementary, I know, so I appreciate everyone's
> > forbearance.
> > 
> >  
> > 
> > Steven Sullivan, Ph.D.
> > 
> > Senior Associate
> > 
> > The QED Group, LLC
> > 
> > 1250 Eye St. NW, Suite 802
> > 
> > Washington, DC  20005
> > 
> > ssullivan at qedgroupllc.com
> > 
> > 202.898.1910.x15 (v)
> > 
> > 202.898.0887 (f)
> > 
> > 202.421.8161 (m)
> > 
> >  
> > 
> > 
> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > 
> 
> 
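
On the loop quoted above: the test formula D1$lat[i]-D2$lat[j] is a signed difference, so which.min() picks the D2 point with the largest latitude whatever i is, which would account for the repeated value. Taking abs(), or using a real distance, should cure that. A minimal sketch, assuming a haversine great-circle distance on a spherical Earth of radius 6371 km: the helper haversine_km and the toy data are hypothetical, not taken from any package or from the original post. The inner loop is replaced by one vectorised call per row of D1.

```r
# Hypothetical helper: great-circle distance in km (haversine formula),
# assuming a spherical Earth of radius 6371 km; inputs in decimal degrees
haversine_km <- function(long1, lat1, long2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat  <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical toy data; column names follow the original question
D1 <- data.frame(long = c(0, 10, 20), lat = c(0, 10, 20),
                 point.id = c("a", "b", "c"), stringsAsFactors = FALSE)
D2 <- data.frame(long = c(1, 19, 11), lat = c(1, 21, 9),
                 point.id = c("x", "y", "z"), stringsAsFactors = FALSE)

# One pass over D1; distances to every row of D2 are computed at once
D1$neighbor.id <- sapply(seq_len(nrow(D1)), function(i) {
  dist.geo <- haversine_km(D1$long[i], D1$lat[i], D2$long, D2$lat)
  # which.min() skips NA distances; if all are NA, return NA explicitly
  if (all(is.na(dist.geo))) NA_character_ else D2$point.id[which.min(dist.geo)]
})
D1
```

As for missing values: which.min() ignores NA entries, so D2 rows with missing coordinates are simply skipped, while a D1 row with missing coordinates gets NA under the explicit check above.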

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



