[R] merging data frames gives all NAs

James Rome jamesrome at gmail.com
Tue Feb 2 16:42:19 CET 2010


David,

Now the code is:
for (j in seq_along(rwy)) { # subset the data and merge them
ar4rw <- subset(arrgnd, Runway == rwy[j])
if(j == 1) {
arrw = ar4rw
}
else {
arrw = merge(arrw, ar4rw)
}
}

I attach the data. I needed 500 rows to get both runways in rwy.

The suggestions did not help much, but did get rid of the row of NAs in
ar4rw. Why?
When I run through the loop for 2 runways, I get

# j = 1, Runway = "31L"
Browse[1]> arrw[1:3,]
DateTime Date month hour minute quarter weekday IATA ICAO Flight
552 1/1/09 23:03 2009-01-01 1 23 3 92 5 AA AAL AAL22
563 1/1/09 23:17 2009-01-01 1 23 17 93 5 DL DAL DAL242
565 1/1/09 23:24 2009-01-01 1 23 24 93 5 DL DAL DAL624
AircraftType Tail Arrived STA Runway FromTo Delay
552 B762 N329AA 23:03:35 23:10 * 31L* LAX /JFK 0
563 B763 N1611B 23:17:37 23:46 31L KATL /KJFK 0
565 B752 N654DL 23:24:04 23:48 31L LAS /JFK 0
Operator dq gw
552 AMERICAN AIRLINES 2009-01-01 92 1
563 DELTA AIR LINES 2009-01-01 93 1
565 DELTA AIR LINES 2009-01-01 93 1
# j = 2 Runway="31R"
Browse[1]> ar4rw[1:3,]
DateTime Date month hour minute quarter weekday IATA ICAO Flight
529 1/1/09 21:46 2009-01-01 1 21 46 87 5 TA TAI TAI570
530 1/1/09 21:48 2009-01-01 1 21 48 87 5 AA AAL AAL2018
531 1/1/09 21:50 2009-01-01 1 21 50 87 5 BA BAW BAW183
AircraftType Tail Arrived STA Runway FromTo Delay
529 A320 N496TA 21:46:58 22:30 * 31R* MSLP /KJFK 0
530 B752 N621AM 21:48:43 21:50 31R TLPL /JFK 0
531 B744 G-CIVI 21:50:26 22:50 31R EGLL /KJFK 0
Operator dq gw
529 TACA INTERNATIONAL AIRLINES 2009-01-01 87 1
530 AMERICAN AIRLINES 2009-01-01 87 1
531 BRITISH AIRWAYS 2009-01-01 87 1
# But the merge gives all NAs!
Browse[1]> arrw[1:3,]
DateTime Date month hour minute quarter weekday IATA ICAO Flight
NA <NA> <NA> NA NA NA NA NA <NA> <NA> <NA>
NA.1 <NA> <NA> NA NA NA NA NA <NA> <NA> <NA>
NA.2 <NA> <NA> NA NA NA NA NA <NA> <NA> <NA>
AircraftType Tail Arrived STA Runway FromTo Delay Operator dq gw
NA <NA> <NA> <NA> <NA> <NA> <NA> NA <NA> <NA> NA
NA.1 <NA> <NA> <NA> <NA> <NA> <NA> NA <NA> <NA> NA
NA.2 <NA> <NA> <NA> <NA> <NA> <NA> NA <NA> <NA> NA
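
A minimal sketch of one likely cause, using a small made-up data frame (`toy`, its columns, and its values are hypothetical and are not the attached data). With no `by` argument, merge() joins on every column name the two data frames share. Two subsets that differ in Runway have no row in common, so the join returns zero rows, and asking for rows 1:3 of a zero-row data frame prints rows of NA. If the goal is to stack the subsets, rbind() may be the intended call.

```r
# Hypothetical toy data standing in for arrgnd
toy <- data.frame(
  Flight = c("AAL22", "DAL242", "TAI570", "BAW183"),
  Runway = c("31L", "31L", "31R", "31R"),
  Delay  = c(0, 0, 0, 0),
  stringsAsFactors = FALSE
)

left  <- subset(toy, Runway == "31L")
right <- subset(toy, Runway == "31R")

# merge() with no `by` joins on all shared columns (Flight, Runway, Delay).
# No row of `left` equals a row of `right`, so nothing matches.
joined <- merge(left, right)
nrow(joined)

# Indexing rows that do not exist yields rows of NA
joined[1:3, ]

# Stacking the subsets row-wise keeps every flight
stacked <- rbind(left, right)
nrow(stacked)
```

For the loop in this thread, `subset(arrgnd, Runway %in% rwy)` would probably give the same set of rows without a loop, in the row order of arrgnd.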

Thanks,
Jim Rome

On Feb 1, 2010, at 5:30 PM, David Winsemius wrote:

>
> On Feb 1, 2010, at 5:16 PM, James Rome wrote:
>
>> Dear kind R helpers,
>>
>> I have a vector of runway names in rwy ("31R", "31L",... the number
>> is user selectable)
>> arrgnd is a data frame with data for all flights and all runways,
>> with a Runway column.
>> I am trying to subset arrgnd into a data frame for each selected
>> runway, and then combine them back together using the following code:
>>
>> for (j in 1:nr) { # nr = number of user-selected runways
>
> Safer would be:
>
> for (j in seq_along(rwy)) {
>
>> ar4rw = arrgnd[arrgnd$Runway==rwy[j],]
>
> Clearer would be :
>
> ar4rw <- subset(arrgnd, Runway == rwy[j]) # and I think the NA lines will
> also disappear.
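
A minimal sketch of why subset() may drop an NA row that `[` indexing keeps. It assumes the Runway column contains at least one NA, which the thread does not confirm; `toy` and its values are hypothetical.

```r
# Hypothetical toy data; one Runway value is missing
toy <- data.frame(
  Flight = c("AAL22", "TAI570", "UNK001"),
  Runway = c("31L", "31R", NA),
  stringsAsFactors = FALSE
)

# The comparison is NA wherever Runway is NA
keep <- toy$Runway == "31R"
keep

# A logical NA in a row index produces a row filled with NA
by_index <- toy[keep, ]
by_index

# subset() treats NA in the condition as FALSE, so that row is dropped
by_subset <- subset(toy, Runway == "31R")
by_subset
```

Under that assumption the data frame need not contain any all-NA row: a single missing Runway value is enough to create one in the `[` result.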
>
>
>> if (j == 1) {
>> arrw = ar4rw
>> }
>> else {
>> arrw = merge(arrw, ar4rw)
>> }
>> }
>
> You really should give us something like:
>
> dput(rwy)
> dput( head(arrgnd, 10) )
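
A minimal sketch of what dput() gives a reader, using hypothetical stand-ins for rwy and arrgnd (`toy` and its values are made up). The printed text is R code that rebuilds the object, so it survives e-mail wrapping better than a printed table.

```r
# Hypothetical toy objects standing in for rwy and arrgnd
rwy <- c("31R", "31L")
toy <- data.frame(
  Flight = c("AAL22", "TAI570"),
  Runway = c("31L", "31R"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object
dput(rwy)
dput(head(toy, 10))

# Round trip: capture the printed text and evaluate it back into an object
txt  <- paste(capture.output(dput(toy)), collapse = "\n")
back <- eval(parse(text = txt))
all.equal(back, toy)
```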
>>
>> but, the merge step gives me a data frame with all NAs. In addition,
>> ar4rw always gets a row with NAs at the start, which I do not
>> understand. There are no rows with all NAs in the arrgnd data frame.
>> > ar4rw[1:2,] # first time through for 31R
>> DateTime Date month hour minute quarter weekday IATA ICAO Flight
>> NA <NA> <NA> NA NA NA NA NA <NA> <NA> <NA>
>> 529 1/1/09 21:46 2009-01-01 1 21 46 87 5 TA TAI TAI570
>> AircraftType Tail Arrived STA Runway FromTo Delay
>> NA <NA> <NA> <NA> <NA> <NA> <NA> NA
>> 529 A320 N496TA 21:46:58 22:30 31R MSLP /KJFK 0
>> Operator dq gw
>> NA <NA> <NA> NA
>> 529 TACA INTERNATIONAL AIRLINES 2009-01-01 87 1
>>
>> > ar4rw[1:2,] # second time through for 31L
>> DateTime Date month hour minute quarter weekday IATA ICAO Flight
>> NA <NA> <NA> NA NA NA NA NA <NA> <NA> <NA>
>> 552 1/1/09 23:03 2009-01-01 1 23 3 92 5 AA AAL AAL22
>> AircraftType Tail Arrived STA Runway FromTo Delay Operator
>> NA <NA> <NA> <NA> <NA> <NA> <NA> NA <NA>
>> 552 B762 N329AA 23:03:35 23:10 31L LAX /JFK 0 AMERICAN AIRLINES
>> dq gw
>> NA <NA> NA
>> 552 2009-01-01 92 1
>>
>> But after the merge, I get all NAs. What am I doing wrong?
>
> The data layout gets mangled and I cannot tell what rows are being
> matched to what. Use dput to convey an unambiguous, and easily
> replicated example.
>>
>> Thanks,
>> Jim Rome
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
