[R] Problems with merge

Vikas Rawal vikas at mail.jnu.ac.in
Wed Oct 6 07:14:05 CEST 2004


This issue has been discussed on this list before, but the solutions
offered were not satisfactory, so I thought I would raise it again.

I want to merge two datasets that have three common variables. These
variables DO NOT have the same names in the two files. In addition,
there are two variables with the same name in both files that do not
necessarily hold exactly the same data. That is, there could be some
discrepancy between the two datasets in these variables. I do not want
them to be used as keys when I merge the datasets.

The problem is that, as far as I can tell, by.x and by.y let you specify
only one variable in the x dataset and one variable in the y dataset to
merge on. Otherwise, if you do not specify anything, merge matches on all
the variables that have common names. This is very problematic. In my
case, the variables I want to match on do not have the same names in the
two datasets, and the ones that do have the same names must not be used
to match.
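
A note on the point above: the help page for merge appears to allow
by.x and by.y to be vectors of column names, paired up in order. A
minimal sketch under that assumption follows. The data frames and all
column names in it (state, district, year, st, dist, yr, area, yield)
are hypothetical and only stand in for the real datasets.

```r
# Hypothetical data: three key columns with different names in x and y,
# plus two same-named columns (area, yield) whose values may disagree.
x <- data.frame(state = c("A", "A", "B"),
                district = c(1, 2, 1),
                year = c(2001, 2001, 2001),
                area = c(10, 20, 30),
                yield = c(1.5, 2.0, 2.5),
                stringsAsFactors = FALSE)
y <- data.frame(st = c("A", "A", "B"),
                dist = c(1, 2, 1),
                yr = c(2001, 2001, 2001),
                area = c(11, 20, 29),
                yield = c(1.4, 2.0, 2.6),
                stringsAsFactors = FALSE)

# by.x[i] is matched against by.y[i]; only these columns are used as keys.
merged <- merge(x, y,
                by.x = c("state", "district", "year"),
                by.y = c("st", "dist", "yr"))

# The same-named non-key columns are kept from both sides with suffixes.
print(names(merged))
```

If this works as expected, the key columns keep their names from x, and
area and yield are not used for matching but come through as area.x,
area.y, yield.x and yield.y, so the discrepancies can be inspected
afterwards.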

One approach would be to rename the variables and then merge. But that
is not elegant, to say the least.

If nothing else works, that is what I shall have to do. But there again
I have a problem: how do I change the name of a particular column? One
solution suggested somewhere in the archives of the list is to use

names(data.frame)=c(list of column names)

But this requires you to list all the variable names. That is obviously
cumbersome when you have a large number of variables. What would be the
syntax if I want to change just one column name?
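
One possibility, sketched here on the assumption that names() can be
indexed and assigned to like any other character vector, is to replace
a single element of it. The data frame df and the column names a, b, c,
beta and gamma are hypothetical.

```r
# Hypothetical data frame with three columns.
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Rename one column by its current name, leaving the others untouched.
names(df)[names(df) == "b"] <- "beta"

# Or rename by position, if the column number is known.
names(df)[3] <- "gamma"

print(names(df))
```

Selecting by the current name is probably the safer of the two, since
it does not depend on the order of the columns.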

Vikas



