[R] How to join the two tables based on one overlapped column

Daniel Malter daniel at umd.edu
Wed Oct 8 17:46:03 CEST 2008


dataToMerge=data.frame(yourtablename2$PROBE_ID2, yourtablename2$SEARCH_KEY2)
##Puts the two columns of interest in dataset 2 in a separate data frame.

mergedData=merge(yourtablename1,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY
2,all.x=T,all.y=F)
##merges the first table with the data frame just created looking for
matches between SEARCH_KEY1 and SEARCH_KEY2, all entries in dataset 1 are
retained (whether matched or not), entries in the dataToMerge dataframe that
do not match any entries in dataset 1 are dropped.

You then have an additional column in "mergedData" that contains the
PROBE_ID2 and you can just assign them to PROBE_ID1 (i.e. replace PROBE_ID1
by the values in this column).

Cheers,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Ursprüngliche Nachricht-----
Von: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] Im
Auftrag von ss
Gesendet: Wednesday, October 08, 2008 11:33 AM
An: R help
Betreff: [R] How to join the two tables based on one overlapped column

Dear list,

I need some clues on this.  I have two excel files and I basically want to
map one to the other one. Can you give me some hints how to do it?

The first excel file, named as "Susan_probe.xls", there are two columns,
"PROBE_ID1" and "SEARCH_KEY1"

  PROBE_ID1 SEARCH_KEY1  ILMN_30212 ILMN_30212  ILMN_1285 ILMN_1285
ILMN_137964 ILMN_137964  ILMN_138109 ILMN_138109 ...

The second excel file, named as "John_probe.xls", there are two columns as
well, "PROBE_ID2" and "SEARCH_KEY2".

  PROBE_ID2 SEARCH_KEY2  ILMN_1809034 ILMN_16367  ILMN_1660305 ILMN_16583
ILMN_1792173 ILMN_19158 ...

There are 46713 rows in the first excel file and 49702 rows in the second
file.

Probes in the first columns of two excel files are different but they can be
matched based on the second column "SEARCH_KEY". So what I want to do is to
substitute the "PROBE_ID1" in the "Susan_probe.xls" file with the
"PROBE_ID2" in the "John_probe.xls" based on their common "SEARCH_KEY".

Thank you so much for your help. I really appreciate.

All the best,
      Allen

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list