[R] compare two data frames of different dimensions and only keep unique rows

Arnaud Gaboury arnaud.gaboury at a2ct2.com
Mon Feb 27 18:36:54 CET 2012


Dear list,

I am still struggling with something that should be easy: I want to compare two data frames that share many rows and keep only the rows that do NOT appear in both, i.e. the rows found in just one of the two data frames.

Here is an example of these data frames.

reported <-
structure(list(Product = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 5L, 5L), .Label = c("Cocoa", "Coffee C", "GC", "Sugar No 11", "ZS"), class = "factor"), Price = c(2331, 2356, 2440, 2450, 204.55, 205.45, 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61L, -61L, 5L, 1L, 40L, 40L, -1L, -1L, -1L, 1L)), .Names = c("Product", "Price", "Nbr.Lots"), row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 5L, 10L, 8L, 9L), class = "data.frame")

exportfile <-
structure(list(Product = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", "Coffee C", "GC", "Sugar No 11", "ZS", "ZS"), Price = c(2331, 2356, 2440, 204.55, 205.45, 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61, -61, 6, 40, 40, -1, -1, -1, 1)), .Names = c("Product", "Price", "Nbr.Lots"), row.names = c(NA, 9L), class = "data.frame")

I can rbind() them, which gives one data frame with duplicated rows, but I have no idea how to delete the duplicated rows. I have tried playing with unique() and duplicated(), with no success.

v <- rbind(exportfile, reported)
v <-
structure(list(Product = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", 
"Coffee C", "GC", "Sugar No 11", "ZS", "ZS", "Cocoa", "Cocoa", 
"Cocoa", "Cocoa", "Coffee C", "Coffee C", "GC", "Sugar No 11", 
"ZS", "ZS"), Price = c(2331, 2356, 2440, 204.55, 205.45, 17792, 
24.81, 1273.5, 1276.25, 2331, 2356, 2440, 2450, 204.55, 205.45, 
17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61, -61, 6, 40, 
40, -1, -1, -1, 1, -61, -61, 5, 1, 40, 40, -1, -1, -1, 1)), .Names = c("Product", 
"Price", "Nbr.Lots"), row.names = c("1", "2", "3", "4", "5", 
"6", "7", "8", "9", "11", "21", "31", "41", "61", "71", "51", 
"10", "81", "91"), class = "data.frame")

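To make the expected result concrete, here is a minimal sketch of the outcome I am after. It rebuilds the two data frames above with plain data.frame() calls (character Product, numeric Nbr.Lots, which is an assumption made to keep the example short) and flags every row of the combined frame that occurs more than once by calling duplicated() in both directions. The names in_both and only_once are made up for illustration, and I am not sure this is the best way to do it.

```r
# Same values as 'reported' and 'exportfile' above, rebuilt compactly.
reported <- data.frame(
  Product  = c("Cocoa", "Cocoa", "Cocoa", "Cocoa", "Coffee C", "Coffee C",
               "GC", "Sugar No 11", "ZS", "ZS"),
  Price    = c(2331, 2356, 2440, 2450, 204.55, 205.45,
               17792, 24.81, 1273.5, 1276.25),
  Nbr.Lots = c(-61, -61, 5, 1, 40, 40, -1, -1, -1, 1),
  stringsAsFactors = FALSE
)

exportfile <- data.frame(
  Product  = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", "Coffee C",
               "GC", "Sugar No 11", "ZS", "ZS"),
  Price    = c(2331, 2356, 2440, 204.55, 205.45,
               17792, 24.81, 1273.5, 1276.25),
  Nbr.Lots = c(-61, -61, 6, 40, 40, -1, -1, -1, 1),
  stringsAsFactors = FALSE
)

v <- rbind(exportfile, reported)

# duplicated(v) marks the second and later copies of a row;
# fromLast = TRUE marks the first and earlier copies.
# Together they flag every row that has a twin anywhere in v.
in_both <- duplicated(v) | duplicated(v, fromLast = TRUE)

# Keep only the rows that occur exactly once in the combined frame.
only_once <- v[!in_both, ]
print(only_once)
```

With the data above this should leave three rows: Cocoa 2440 with 6 lots (only in exportfile), and Cocoa 2440 with 5 lots and Cocoa 2450 with 1 lot (only in reported). One caveat: a row repeated inside a single data frame would also be dropped, which does not happen with this data.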

Thank you for your help.

Arnaud Gaboury
 
A2CT2 Ltd.


