[R] Concatenating data frames with partial overlap in variable names

Marc Schwartz marc_schwartz at comcast.net
Sun Mar 25 04:00:32 CEST 2007


On Sat, 2007-03-24 at 21:47 -0400, Daniel Folkinshteyn wrote:
> Greetings to all.
> I need to concatenate data frames that do not all have the same variable
> names; the variables only partially overlap. For example, suppose I have
> two data frames, a and b, that look like the following:
> > a
>   a b
> 1 1 4
> 2 2 5
> 3 3 6
> 4 4 7
> 5 5 8
> > b
>   c  a
> 1 1 10
> 2 2 11
> 3 3 12
> 4 4 13
> 5 5 14
> 
> I want to concatenate them by row, without any matching, so that the
> variables that are not available in all frames get NAs. The result
> should look like:
> 
>     a  b  c
> 1   1  4 NA
> 2   2  5 NA
> 3   3  6 NA
> 4   4  7 NA
> 5   5  8 NA
> 6  10 NA  1
> 7  11 NA  2
> 8  12 NA  3
> 9  13 NA  4
> 10 14 NA  5
> 
> rbind doesn't work, since it requires all variables to be matched
> between the two data frames. merge doesn't work, since it wants to
> /match/ by columns with the same name, and if matching by nothing,
> it produces a Cartesian product.
> 
> Is there a neat trick for doing this simply, or am I stuck with
> comparing variable lists and generating NAs manually?
> 
> I would appreciate any help!
> Daniel

You can use merge():

> a
  a b
1 1 4
2 2 5
3 3 6
4 4 7
5 5 8

> b
  c  a
1 1 10
2 2 11
3 3 12
4 4 13
5 5 14


Use 'a' as the common 'by' column and specify 'all = TRUE' so that
non-matching values of 'a' will be included in the result:


> merge(a, b, by = "a", all = TRUE)
    a  b  c
1   1  4 NA
2   2  5 NA
3   3  6 NA
4   4  7 NA
5   5  8 NA
6  10 NA  1
7  11 NA  2
8  12 NA  3
9  13 NA  4
10 14 NA  5
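
One caveat worth checking on your real data: the result above looks like
a plain row-wise concatenation only because no value of 'a' occurs in
both data frames. If some values were shared, merge() would combine those
rows instead of stacking them. A small sketch with made-up data (a2 and
b2 are hypothetical, not your data frames) to illustrate:

```r
# Hypothetical data: the values 3, 4 and 5 of 'a' occur in both frames
a2 <- data.frame(a = 1:5, b = 4:8)
b2 <- data.frame(c = 1:5, a = 3:7)

# Rows with a shared value of 'a' are matched and collapsed into one row
m <- merge(a2, b2, by = "a", all = TRUE)

nrow(a2) + nrow(b2)  # 10 rows in total going in
nrow(m)              # 7 rows coming out, one per distinct value of 'a'
```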


See ?merge for more information.
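
If you need a concatenation that never matches rows, whatever the values
of 'a', one possible approach is to add the missing columns as NA and
then call rbind(). The following is only a sketch using base R;
rbind_fill() is a name made up for this example, not an existing
function:

```r
# Hypothetical helper (not part of base R): stack two data frames by row,
# filling columns that are missing from either one with NA.
rbind_fill <- function(x, y) {
  all_cols <- union(names(x), names(y))
  # Add each column that x lacks, filled with NA
  for (col in setdiff(all_cols, names(x))) x[[col]] <- rep(NA, nrow(x))
  # Add each column that y lacks, filled with NA
  for (col in setdiff(all_cols, names(y))) y[[col]] <- rep(NA, nrow(y))
  # Both frames now have the same columns; put them in the same order
  rbind(x[all_cols], y[all_cols])
}

a <- data.frame(a = 1:5, b = 4:8)
b <- data.frame(c = 1:5, a = 10:14)

res <- rbind_fill(a, b)
res
```

The rows keep their original order, all rows of the first data frame
followed by all rows of the second, instead of being sorted by 'a' as in
the merge() output.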

HTH,

Marc Schwartz


