[R] Data Table Merge Help

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Thu Feb 1 23:24:11 CET 2018

I rarely use data.table, but I think the vignette for the package discusses rolling joins. Also,  Google popped up [1].

[1] https://www.r-bloggers.com/understanding-data-table-rolling-joins/
Sent from my phone. Please excuse my brevity.

On February 1, 2018 9:45:53 AM PST, "Graeve, Nick" <Graeve.Nick at principal.com> wrote:
>I'm not sure if this is an appropriate use of this mailing list or not,
>please let me know if it isn't.  I'm struggling to figure out how to
>merge two data tables based on max effective date logic compared to
>when a payment occurred.  My dtDistributions DT is a transactional
>dataset while dtDepartments is a domain data set containing all
>department names and the effective date of when department name changes
>have occurred.  For the Bob example below, there was a payment on
>2016-01-01 which occurred in H229000.  In 2012, this department was
>named "Modified Name", in 2019 the department will be named "Final
>Name".  When I merge these two tables, I'd like it to pull the
>transactional data and match it up to department name "Modified Name"
>since that was the active department name at the time of that
>transaction.  I've read documentation on foverlaps, but I'm not sure if
>this problem is considered a range of dates or not.  At the bottom of
>this post is a temporarily solution that is working but it runs for a
>long time due to the amount of data in my actual source.
>Here is some sample data to get started:
>dtDistributions <- data.table(PayeeName = c("Bob", "Tracy", "Tom"),
>                          Department = factor(c("H229000", "H135000",
>                          Amount = c(5, 34, 87),
>                          PaymentDT = as.Date(c("2016-01-01",
>"2015-01-01", "2015-01-01")))
>dtDepartments <- data.table(Department = factor(c("H229000", "H229000",
>"H229000", "H135000", "H047800")),
>                        EffDT = as.Date(c("2019-01-01", "2012-01-01",
>"1901-01-01", "1901-01-01", "1901-01-01")),
>                        Descr = c("Final Name","Modified
>Name","Original Name","Payables","Postal"))
>Here is the output I would like to see:
>PayeeName  Department     PaymentDT   Amount
>Bob        Modified Name  2016-01-01  5
>Tracy      Payables       2015-01-01  34
>Tom        Postal         2015-01-01  87
>I was able to get this working by using the sqldf library, but it runs
>for a very long time in my actual dataset and I'd like to use
>data.table if at all possible.
>joinString <- "SELECT A.PayeeName, B.Descr, A.PaymentDT, A.Amount
>            FROM dtDistributions A, dtDepartments B
>            WHERE A.DEPARTMENT = B.Department
>            AND B.EffDT = (SELECT MAX(ED.EffDT)
>                            FROM dtDepartments ED
>                            WHERE B.Department = ED.Department
>                            AND ED.EffDT <= A.PaymentDT)"
>finalDT <- data.table(sqldf(joinString))
>-----Message Disclaimer-----
>This e-mail message is intended only for the use of the individual or
>entity to which it is addressed, and may contain information that is
>privileged, confidential and exempt from disclosure under applicable
>law. If you are not the intended recipient, any dissemination,
>distribution or copying of this communication is strictly prohibited.
>If you have received this communication in error, please notify us
>immediately by reply email to Connect at principal.com and delete or
>destroy all copies of the original message and attachments thereto.
>Email sent to or from the Principal Financial Group or any of its
>member companies may be retained as required by law or regulation.
>Nothing in this message is intended to constitute an Electronic
>signature for purposes of the Uniform Electronic Transactions Act
>(UETA) or the Electronic Signatures in Global and National Commerce Act
>("E-Sign") unless a specific statement to the contrary is included in
>this message.
>If you no longer wish to receive any further solicitation from the
>Principal Financial Group you may unsubscribe at
>https://www.principal.com/do-not-contact-form any time.
>If you are a Canadian resident and no longer wish to receive commercial
>electronic messages you may unsubscribe at
>https://www.principal.com/do-not-email-request-canadian-residents any
>This message was secured by Zix(R).
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>PLEASE do read the posting guide
>and provide commented, minimal, self-contained, reproducible code.

More information about the R-help mailing list