[R] How to select a row from one dataframe that is "close" to a row in another dataframe

Daniel Malter daniel at umd.edu
Sat Mar 20 16:52:44 CET 2010


If the flight identifiers runway$Flight and oooi$Flight are unique (i.e.,
each identifier appears only once in each dataset), you could use
merge() to join the two datasets on that column. See

?merge

Also, I see an OnDate variable in both datasets. So if Flight alone does
not uniquely identify a row, maybe Flight and OnDate together do, and
merge() can handle that as well.
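
A minimal sketch of that idea, on made-up toy data (the values, the
ArrivalGate entries, and the helper key_is_unique() are invented for
illustration; only the column names come from the original post). It
assumes OnDate is stored the same way in both data frames:

```r
# Toy stand-ins for the two data frames; all values are invented.
oooi <- data.frame(
  Flight = c("AA100", "AA100", "DL200"),
  OnDate = as.Date(c("2010-03-01", "2010-03-02", "2010-03-01")),
  ArrivalGate = c("A1", "A2", "B7"),
  stringsAsFactors = FALSE
)
runway <- data.frame(
  Flight = c("AA100", "AA100", "DL200", "UA300"),
  OnDate = as.Date(c("2010-03-01", "2010-03-02", "2010-03-01", "2010-03-01")),
  Runway = c("26R", "27L", "26R", "28"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if no combination of the key columns repeats.
key_is_unique <- function(df, key) !any(duplicated(df[, key, drop = FALSE]))

flight_unique <- key_is_unique(oooi, "Flight")                 # Flight alone
pair_unique   <- key_is_unique(oooi, c("Flight", "OnDate"))    # Flight + OnDate

# Left join: keep every oooi row and add Runway where a match exists.
# Only the key columns and Runway are taken from runway, so the other
# shared column names (pt, MinutesIntoDay) do not get .x/.y suffixes.
merged <- merge(oooi, runway[, c("Flight", "OnDate", "Runway")],
                by = c("Flight", "OnDate"), all.x = TRUE)
```

If the combined key still repeats in runway, merge() returns one row per
matching pair, so checking uniqueness first is worthwhile.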

Let us know if that solves the problem.

Best,
Daniel 

-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of James Rome
Sent: Saturday, March 20, 2010 10:20 AM
To: r-help at r-project.org
Subject: [R] How to select a row from one dataframe that is "close" to a row
in another dataframe

I have two data frames of flight data,  but they have very different
numbers of rows. They come from different sources, so the data are not
identical.

> names(oooi)
 [1] "FltOrigDt"               "MkdCrrCd"              
 [3] "MkdFltNbr"               "DprtTrpnStnCd"         
 [5] "ArrTrpnStnCd"            "ActualOutLocalTimestamp"
 [7] "ActualOffLocal"          "ActualOnLocal"         
 [9] "ActualInLocal"           "ArrivalGate"           
[11] "DepartureGate"           "Flight"                
[13] "OnDate"                  "MinutesIntoDay"        
[15] "OnHour"                  "pt"  

> names(runway)
 [1] "OnDateTime"     "IATA"           "ICAO"           "Flight"       
 [5] "AircraftType"   "Tail"           "Arrived"        "STA"          
 [9] "Runway"         "From.To"        "Delay"          "OnDate"       
[13] "MinutesIntoDay" "pt"   

These sets have several hundred thousand rows.

In both sets, pt is a POSIXct for the arrival time (from different
sources). They are not identical, but surely should be within an hour of
each other (hopefully a lot less), and the Flight fields must be the
same. So
(abs(runway$pt - oooi$pt) < 3600) & (runway$Flight == oooi$Flight)
should pick out the corresponding rows in the two data sets (if there is
a match).
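
As written, the condition above compares the two pt columns element by
element, which only lines up if both data frames have the same rows in
the same order. One possible way to apply it across data frames of
different lengths is sketched below on invented toy data; row_id and gap
are hypothetical helper columns, and the approach (merge on Flight, then
filter by time gap) is an assumption, not something from the thread:

```r
# Toy data; all values are invented for illustration.
oooi <- data.frame(
  Flight = c("AA100", "AA100", "DL200"),
  pt = as.POSIXct(c("2010-03-01 10:05:00", "2010-03-01 18:40:00",
                    "2010-03-01 12:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)
runway <- data.frame(
  Flight = c("AA100", "AA100", "DL200"),
  pt = as.POSIXct(c("2010-03-01 10:00:00", "2010-03-01 18:30:00",
                    "2010-03-01 15:00:00"), tz = "UTC"),
  Runway = c("26R", "27L", "28"),
  stringsAsFactors = FALSE
)

oooi$row_id <- seq_len(nrow(oooi))   # hypothetical helper to track oooi rows

# Pair every oooi row with every runway row that has the same Flight.
pairs <- merge(oooi, runway, by = "Flight", suffixes = c(".oooi", ".runway"))

# Time gap in seconds; units are set explicitly because difftime
# otherwise chooses them automatically.
pairs$gap <- abs(as.numeric(difftime(pairs$pt.runway, pairs$pt.oooi,
                                     units = "secs")))

# Keep pairs within one hour, then the closest candidate per oooi row.
pairs <- pairs[pairs$gap < 3600, ]
pairs <- pairs[order(pairs$row_id, pairs$gap), ]
pairs <- pairs[!duplicated(pairs$row_id), ]

# Copy Runway back into oooi; rows without a match get NA.
oooi$Runway <- pairs$Runway[match(oooi$row_id, pairs$row_id)]
```

With several hundred thousand rows, merging on Flight alone can produce
a very large intermediate table if the same flight number recurs daily;
adding OnDate to the merge key would shrink it, at the cost of missing
arrivals recorded on either side of midnight.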

What I need to do is to take the Runway from runway and insert it into
the oooi df for the correct flight.

What is the best way to do this in R?

Thanks,
Jim Rome
