[R] merging or joining 2 dataframes: merge, rbind.fill, etc.?

Anika Masters anika.masters at gmail.com
Wed Feb 27 03:33:59 CET 2013


Thanks Arun and David.  Another issue I am running into is running out
of memory when one of the data frames I'm trying to rbind to or merge
with is "very large".  (This is a recurring problem, as I am trying
to merge/rbind thousands of small dataframes into a single "very
large" dataframe.)
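
One base-R pattern that may ease the memory pressure, sketched under the assumption that all the small data frames already share the same columns: keep the pieces in a list and bind them in a single call, rather than growing the large data frame one rbind at a time. The `pieces` list below is a made-up stand-in for the real data.

```r
# Hypothetical stand-in for the thousands of small data frames
pieces <- lapply(1:1000, function(i) data.frame(id = i, value = i * 2))

# Pattern to avoid: the accumulated result is copied on every iteration
# big <- pieces[[1]]
# for (p in pieces[-1]) big <- rbind(big, p)

# Bind all pieces in one call instead
big <- do.call(rbind, pieces)
dim(big)
```

This still needs enough memory to hold the final result, but it usually avoids the repeated copying that comes with appending inside a loop.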



I'm thinking of creating a function that creates an empty dataframe to
which I can add data, but I will first need to ensure that each
dataframe has exactly the same columns, in exactly the same order.
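
A minimal sketch of that column-alignment step in base R. The helper name `align_columns` is hypothetical, and filling the absent columns with NA is an assumption about the desired behaviour; df1 and df2 hold the same values as the examples quoted further down.

```r
# Hypothetical helper: give every data frame the same columns, in the same order
align_columns <- function(dfs) {
  # Union of all column names, in order of first appearance
  all_cols <- unique(unlist(lapply(dfs, names)))
  lapply(dfs, function(d) {
    # Add each absent column, filled with NA
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- rep(NA, nrow(d))
    }
    # Reorder to the common layout
    d[all_cols]
  })
}

df1 <- data.frame(a = 7, b = 99, d = 12)
df2 <- data.frame(d = 88, b = 34, x = 12, y = 44, c = 56)

aligned <- align_columns(list(df1, df2))
mydf <- do.call(rbind, aligned)
mydf
```

Once every piece has the same layout, a plain rbind is enough and no merge is needed.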



Before I write any new code, are there any pre-existing functions or
code that might solve this problem of merging small or medium-sized
dataframes with a "very large" dataframe?
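
For reference, rbind.fill (from the plyr package, already used in this thread) also accepts a list of data frames, so all pieces can be combined in one call. The sketch below assumes plyr is installed. The data.table alternative, rbindlist with fill = TRUE, is an option not mentioned in the thread and is included only as an assumption about what might suit very large inputs. Both calls are guarded so the code still runs if a package is absent.

```r
df1 <- data.frame(a = 7, b = 99, d = 12)
df2 <- data.frame(d = 88, b = 34, x = 12, y = 44, c = 56)
pieces <- list(df1, df2)  # stand-in for the full list of data frames

# plyr: fills columns missing from a piece with NA
if (requireNamespace("plyr", quietly = TRUE)) {
  combined <- plyr::rbind.fill(pieces)
  print(combined)
}

# data.table (assumed alternative): returns a data.table, not a data.frame
if (requireNamespace("data.table", quietly = TRUE)) {
  combined_dt <- data.table::rbindlist(pieces, use.names = TRUE, fill = TRUE)
  print(combined_dt)
}
```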

On Tue, Feb 26, 2013 at 2:00 PM, David L Carlson <dcarlson at tamu.edu> wrote:
> Clumsy but it doesn't require any packages:
>
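> merge2 <- function(x, y) {
>   # Same set of column names: stack the rows (rbind matches columns by name)
>   if (setequal(names(x), names(y))) {
>     rbind(x, y)
>   } else {
>     # Different column names: keep all rows from both data frames
>     merge(x, y, all=TRUE)
>   }
> }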
> merge2(df1, df2)
> df3 <- df1
> merge2(df1, df3)
>
> ----------------------------------------------
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of arun
>> Sent: Tuesday, February 26, 2013 1:14 PM
>> To: Anika Masters
>> Cc: R help
>> Subject: Re: [R] merging or joining 2 dataframes: merge, rbind.fill,
>> etc.?
>>
>> Hi,
>>
>> You could also try:
>> library(gtools)
>> smartbind(df2,df1)
>> #  a  b  d
>> #1 7 99 12
>> #2 7 99 12
>>
>>
>> When df1!=df2
>> smartbind(df1,df2)
>> #   a  b  d  x  y  c
>> #1  7 99 12 NA NA NA
>> #2 NA 34 88 12 44 56
>> A.K.
>>
>>
>>
>>
>> ----- Original Message -----
>> From: Anika Masters <anika.masters at gmail.com>
>> To: r-help at r-project.org
>> Cc:
>> Sent: Tuesday, February 26, 2013 1:55 PM
>> Subject: [R] merging or joining 2 dataframes: merge, rbind.fill, etc.?
>>
>> #I want to "merge" or "join" 2 dataframes (df1 & df2) into a 3rd
>> (mydf).  I want the 3rd dataframe to contain 1 row for each row in df1
>> & df2, and all the columns in both df1 & df2. The solution should
>> "work" even if the 2 dataframes are identical, and even if the 2
>> dataframes do not have the same column names.  The rbind.fill function
>> seems to work.  For learning purposes, are there other "good" ways to
>> solve this problem, using merge or other functions other than
>> rbind.fill?
>>
>> #e.g. These 2 examples both seem to "work" correctly and as I hoped:
>>
>> df1 <- data.frame(matrix(data=c(7, 99, 12) ,  nrow=1 ,  dimnames =
>> list( NULL ,  c('a' , 'b' , 'd') ) ) )
>> df2 <- data.frame(matrix(data=c(88, 34, 12, 44, 56) ,  nrow=1 ,
>> dimnames = list( NULL ,  c('d' , 'b' , 'x' ,  'y', 'c') ) ) )
>> mydf <- merge(df2, df1, all.y=TRUE, all.x=TRUE)
>> mydf
>>
>> #e.g. this works:
>> library(reshape)
>> mydf <- rbind.fill(df1, df2)
>> mydf
>>
>> #But this does not (the 2 dataframes are identical)
>> df1 <- data.frame(matrix(data=c(7, 99, 12) ,  nrow=1 ,  dimnames =
>> list( NULL ,  c('a' , 'b' , 'd') ) ) )
>> df2 <- df1
>> mydf <- merge(df2, df1, all.y=T, all.x=T)
>> mydf
>>
>> #Any way to get "merge" to work for this final example? Any other good
>> solutions?
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>


