[R] intersect two files

Liaw, Andy andy_liaw at merck.com
Wed Aug 11 01:01:26 CEST 2004


You have not given enough info.  Do the data sets have the same columns?  If
not, you need to tell us more about how you can tell whether one row of a
data frame is `identical' to some row of another.

Assuming the columns are the same between the two, the basic idea is to
paste all columns together into a single character vector for each data
frame (one string per row), then check which elements of one are in the
other.  Something like (code untested!):

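id1 <- do.call("paste", c(data1, sep=":"))
id2 <- do.call("paste", c(data2, sep=":"))
## Rows of data1 that are in data2 (logical vector):
r1 <- id1 %in% id2

## Remove (a logical index also works when no rows match,
## whereas data1[-which(...),] would then return zero rows):
data1.reduced <- data1[!r1,]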
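
Here is a small self-contained sketch of the same idea on made-up data, in case it helps to see it run.  The two toy data frames and the name in.both are only illustrative assumptions, not anything from the original question.  Note that pasting with ":" can give false matches if the separator itself occurs in the data (e.g. "a:b" with "c" versus "a" with "b:c"), so pick a separator that cannot appear in any column.

```r
## Toy data (hypothetical): same columns in both data frames
data1 <- data.frame(x = c(1, 2, 3, 4), y = c("a", "b", "c", "d"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(x = c(2, 4, 9), y = c("b", "d", "z"),
                    stringsAsFactors = FALSE)

## Build one key string per row by pasting all columns together
id1 <- do.call("paste", c(data1, sep = ":"))
id2 <- do.call("paste", c(data2, sep = ":"))

## TRUE for rows of data1 that also occur in data2
in.both <- id1 %in% id2

## Keep only the rows of data1 that are not in data2
data1.reduced <- data1[!in.both, , drop = FALSE]
print(data1.reduced)
```

With these toy inputs, rows 2 and 4 of data1 also occur in data2, so data1.reduced should keep only rows 1 and 3.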

Andy


> From: Christian Mora
> 
> Hi all;
> I'm working with two datasets in R, say data1 and data2. Both datasets
> are data frames with several rows and columns, and some of the rows are
> identical in both datasets. I'm wondering if there is any way to remove
> from one set, say data1, the rows that are identical in the other set,
> say data2, using R?
> Thanks for any hint in advance
> Christian