[R] Matching rows in a Data set? I'm Stuck!!

Marc Schwartz marc_schwartz at me.com
Wed Mar 3 14:30:55 CET 2010


On Mar 3, 2010, at 5:52 AM, BioStudent wrote:

> Hi, I'm having (yet another) problem with R.  
> 
> I have a few data sets that have this sort of format
> 
> dataset1
> ID DATA
> 1234 value
> 2345 value
> 3456 value
> 
> dataset2
> ID DATA
> 1111 value
> 2345 value
> 3333 value
> 
> What I really want to do is write an R script that says "if the ID of
> dataset1 and 2 match (2nd row), print out that whole row into a new
> dataset3". No idea how to do that though. Normally I would just write out
> the files to a txt file then write a Perl script that would do just that.
> However these files are HUGE and Perl will take forever to do this!! I'm
> hoping there's a quicker solution in R...
> 
> Any help appreciated


See ?merge, which performs SQL-like join operations:

> dataset1
    ID   DATA
1 1234 value1
2 2345 value1
3 3456 value1

> dataset2
    ID   DATA
1 1111 value2
2 2345 value2
3 3333 value2

> merge(dataset1, dataset2, by = "ID")
    ID DATA.x DATA.y
1 2345 value1 value2
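
For a self-contained version of the above, here is a minimal sketch. It assumes the two data sets are already data frames with a common ID column; the toy values are taken from the example, while the names dataset3 and matched1 are only illustrative. If you just want the matching rows of dataset1 as they are, rather than the columns of both data sets side by side, subsetting with %in% is one alternative:

```r
# Toy data frames mirroring the example (real data would come from
# read.table() or similar)
dataset1 <- data.frame(ID = c(1234, 2345, 3456), DATA = "value1",
                       stringsAsFactors = FALSE)
dataset2 <- data.frame(ID = c(1111, 2345, 3333), DATA = "value2",
                       stringsAsFactors = FALSE)

# Inner join: keep only the IDs present in both data frames.
# The non-key columns get the default suffixes .x and .y
dataset3 <- merge(dataset1, dataset2, by = "ID")
print(dataset3)

# Alternative: keep the rows of dataset1 whose ID also occurs in dataset2,
# without bringing in any columns from dataset2
matched1 <- dataset1[dataset1$ID %in% dataset2$ID, ]
print(matched1)
```

By default merge() drops IDs that do not appear in both data frames; see the 'all', 'all.x' and 'all.y' arguments in ?merge if you need to keep unmatched rows.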

HTH,

Marc Schwartz
