[R] inner join sqldf

Gabor Grothendieck ggrothendieck at gmail.com
Mon Mar 15 13:33:25 CET 2010


The R code posted has syntax errors (there are no quotes around the
URL) and the SQL statement also has syntax errors (there is no from
clause).  Please correct and post again.

On Mon, Mar 15, 2010 at 8:19 AM, Newbie19_02 <nvanzuydam at gmail.com> wrote:
>
> Hi,
>
> I have two dataframes that have some common columns.   I would like to join
> them by the common columns prochi and prescribed_date as there are duplicate
> prochis but they will be made unique by date.  I tried doing an inner join
> but that just duplicated the columns whereas I would like the information
> from the test_sql_tsf to fill the NAs in the test_sql_psd common columns.
>
> require(sqldf)
>
> test_sql_psd <-  http://n4.nabble.com/file/n1593282/test_sql_psd.txt
> test_sql_psd.txt , header=TRUE, sep="\t", dec=".", na.strings="NA",
> check.names=TRUE, quote= "\"'")
>
>
> test_sql_tsf <-read.table(file=
> http://n4.nabble.com/file/n1593282/test_sql_tsf.txt test_sql_tsf.txt , ,
> header=TRUE, sep="\t", dec=".", na.strings="NA", check.names=TRUE, quote=
> "\"'")
>
>
> test_sql_innerjoin <- sqldf("select * test_sql_psd inner join test_sql_tsf
> on test_sql_psd.prochi=test_sql_tsf.prochi")
>
> colnames(test_sql_psd)
>  [1] "prochi"           "prescribed_date"  "dataMonth"        "item_code"
>  [5] "res_seqno"        "quantity"         "directions"       "no_of_packs"
>  [9] "datasource"       "scan_ref_no"      "name"
> "approved_name"
> [13] "formulation_code" "strength"         "measure_code"     "bnf_code"
> [17] "bnf_description"
>
>  colnames(test_sql_tsf)
>  [1] "prochi"           "prescribed_date"  "dataMonth"        "item_code"
>  [5] "res_seqno"        "quantity"         "directions"       "no_of_packs"
>  [9] "datasource"       "scan_ref_no"      "name"
> "formulation_code"
> [13] "strength"         "bnf_code"
>
>
> with the result of:
>
> colnames(test_sql_innerjoin)
>  [1] "prochi"           "prescribed_date"  "dataMonth"        "item_code"
>  [5] "res_seqno"        "quantity"         "directions"       "no_of_packs"
>  [9] "datasource"       "scan_ref_no"      "name"
> "formulation_code"
> [13] "strength"         "bnf_code"         "prochi"
> "prescribed_date"
> [17] "dataMonth"        "item_code"        "res_seqno"        "quantity"
> [21] "directions"       "no_of_packs"      "datasource"       "scan_ref_no"
> [25] "name"             "approved_name"    "formulation_code" "strength"
> [29] "measure_code"     "bnf_code"         "bnf_description"
>
>
> I'm not sure if I am using the correct sqldf command or if there is an
> easier way to do this from the start.  I also tried
> test_sql_union<-sqldf("select * test_sql_tsf union select * test_sql_psd")
> which gave me the same result as inner join.  I'm not sure if I am using the
> correct commands for what I want to do?
>
>
>
> Thanks for your help.
> Natalie
>
> --
> View this message in context: http://n4.nabble.com/inner-join-sqldf-tp1593282p1593282.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list