[R] merge multiple data frames

Massimo Bressan mbressan at arpa.veneto.it
Mon Jan 30 13:40:56 CET 2012


hi don

I followed your advice about using sqldf package but the problem of 
labelling the fields persists;
for some reasons I can not properly handle the sql 'as' statement....

a_b<-sqldf("select a.*, b.* from a left join b on a.date=b.date")
a_b_c<-sqldf("select a_b.*, c.* from a_b left join c on a_b.date=c.date")

bye

max





----- Original Message ----- 
From: "MacQueen, Don" <macqueen1 at llnl.gov>
To: "maxbre" <mbressan at arpa.veneto.it>; <r-help at r-project.org>
Sent: Saturday, January 28, 2012 12:24 AM
Subject: Re: [R] merge multiple data frames


Not tested, but this might be a case for the sqldf package.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 1/26/12 9:29 AM, "maxbre" <mbressan at arpa.veneto.it> wrote:

>This is my reproducible example (three data frames: a, b, c)
>
>a<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
>"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
>"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
>"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
>"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
>), class = "factor"), so2 = c(0.799401398190476, 0, 0,
>0.0100453950434783,
>0.200154920565217, 0.473866969181818), nox = c(111.716109973913,
>178.077239330435, 191.257829021739, 50.6799951473913, 115.284643540435,
>110.425185027727), no = c(48.8543691516522, 88.7197448817391,
>93.9931932472609, 13.9759949817391, 43.1395266865217, 41.7280296016364
>), no2 = c(36.8673432865217, 42.37150668, 47.53311701, 29.3026882474783,
>49.2986070321739, 46.5978461731818), co = c(0.618856168125,
>0.996593475083333,
>0.666987416083333, 0.383437311166667, 0.281604928875, 0.155383408913043
>), o3 = c(12.1393100029167, 12.3522739816522, 10.9908791203043,
>26.9122200013043, 13.8421695947826, 12.3788847045455), ipa =
>c(167.541954974667,
>252.7196257875, 231.802370709167, 83.4850259595833, 174.394613581667,
>173.868599272609), ws = c(1.47191016429167, 0.765781205208333,
>0.937053086791667, 1.581022406625, 0.909756802125, 0.959252831695652
>), wd = c(45.2650019737732, 28.2493544114369, 171.049080544214,
>319.753674830936, 33.8713897347193, 228.368119533759), temp =
>c(7.91972825883333,
>3.79434291520833, 2.1287644735, 6.733854600625, 3.136579722,
>3.09864120704348), umr = c(86.11566638875, 94.5034087491667,
>94.14451249375, 53.1016709004167, 65.63420423, 74.955669236087
>)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
>"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
>"data.frame")
>
>
>b<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
>"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
>"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
>"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
>"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
>), class = "factor"), so2 = c(0, 0, 0, 0, 0, 0), nox = c(13.74758511,
>105.8060582, 61.22720599, 11.45280354, 56.86804174, 39.17917222
>), no = c(0.882593766, 48.97037506, 9.732937217, 1.794549972,
>16.32300019, 8.883637786), no2 = c(11.80447753, 25.35235381,
>28.72990261, 8.590004034, 31.9003796, 25.50512403), co = c(0.113954917,
>0.305985964, 0.064001839, 0, 1.86e-05, 0), o3 = c(5.570499897,
>9.802379608, 5.729360104, 11.91304016, 12.13407993, 10.00961971
>), ipa = c(6.065110207, 116.9079971, 93.21240234, 10.5777998,
>66.40740204, 34.47359848), ws = c(0.122115001, 0.367668003, 0.494913995,
>0.627124012, 0.473895013, 0.593913019), wd = c(238.485119317031,
>221.645073036776, 220.372076815032, 237.868340917096, 209.532933617465,
>215.752030286564), temp = c(4.044159889, 1.176810026, 0.142934993,
>0.184606999, -0.935989976, -2.015399933), umr = c(72.29229736,
>88.69879913, 87.49530029, 24.00079918, 44.8852005, 49.47729874
>)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
>"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
>"data.frame")
>
>
>c<-structure(list(date = structure(1:6, .Label = c("2012-01-03",
>"2012-01-04", "2012-01-05", "2012-01-06", "2012-01-07", "2012-01-08",
>"2012-01-09", "2012-01-10", "2012-01-11", "2012-01-12", "2012-01-13",
>"2012-01-14", "2012-01-15", "2012-01-16", "2012-01-17", "2012-01-18",
>"2012-01-19", "2012-01-20", "2012-01-21", "2012-01-22", "2012-01-23"
>), class = "factor"), so2 = c(2.617839247, 0, 0, 0.231044086,
>0.944608887, 2.12400444), nox = c(308.9046313, 275.6778849, 390.0824142,
>178.7429364, 238.655832, 251.892601), no = c(156.0262489, 151.4412498,
>221.0725021, 65.96049786, 106.541748, 119.3471241), no2 = c(74.80145447,
>59.29991481, 66.5897975, 77.84267978, 75.68422569, 85.43044816
>), co = c(1.628431197, 1.716231492, 1.264678366, 1.693460745,
>0.780637084, 0.892724398), o3 = c(26.1473999, 15.91584015, 22.46199989,
>37.39400101, 15.63426018, 17.51494026), ipa = c(538.414978, 406.4620056,
>432.6459961, 275.2820129, 435.7909851, 436.8039856), ws = c(4.995530128,
>1.355309963, 1.708899975, 3.131690025, 1.546270013, 1.571320057
>), wd = c(58.15639877, 64.5657153143848, 39.9754269501381,
>24.0739884380921,
>55.9453098437477, 56.7648829092446), temp = c(10.24740028, 7.052690029,
>4.33258009, 13.91609955, 8.762220383, 11.04300022), umr = c(97.60900116,
>96.91899872, 96.20649719, 94.74620056, 82.04550171, 89.41320038
>)), .Names = c("date", "so2", "nox", "no", "no2", "co", "o3",
>"ipa", "ws", "wd", "temp", "umr"), row.names = c(NA, 6L), class =
>"data.frame")
>
>
>Given the three data frames ³a², ³b² and ³c², I need to merge them all by
>the common field ³date².
>The following attempt is working fine butŠ
>
># code start
>all<-merge(a,b,by="date",suffixes=c(".a",".b"),all=T)
>all<-merge(all,c,by="date",all=T)
># code end
>
>ŠI would like to possibly do it in just ³one shot² (a generalisation of
>the
>code for a more complex case of handling many data frames) also by
>assigning
>proper suffixes to each variable (not possible with the previous code
>snippet)
>
>Then I also try a different approach with the use of the library reshape
>and
>the function ³merge_all² butŠ
>
># code start
>library("reshape")
>all.new<-merge_all(a, b, c, by="date", all=T, suffixes=c(".a",".b",".c"))
># code end
>
>ŠI got the following error message:
>error in merge in merge.data.frame(as.data.frame(x), as.data.frame(y),
>...)
>:
>  formal argument "all" associated to different argument passed
>(a free translation from italian)
>
>My question is:
>how to accomplish the merging of multiple data frames with all the same
>variable names and by a single id field with the need of ³keeping track²
>of
>the original data frame by means of the use of suffixes appended to new
>variables?
>
>Any help much appreciated
>
>Thank you
>
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/merge-multiple-data-frames-tp4331089p4331089
>.html
>Sent from the R help mailing list archive at Nabble.com.
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list