[R] Merge with closest (not equal) time stamps

Naresh Gurbuxani n@re@h_gurbux@n| @end|ng |rom hotm@||@com
Tue Aug 8 04:31:48 CEST 2023


I have two dataframes, each with a column for timestamp.  I want to
merge the two dataframes such that each row from first dataframe
is matched with the row in the second dataframe with most recent but
preceding timestamp. Here is an example.

option.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:22", "2023-08-07 10:25:33", "2023-08-07 10:28:41")), option.price = c(2.5, 2.7, 1.8))

stock.trades <- data.frame(timestamp = as.POSIXct(c("2023-08-07 10:23:21", "2023-08-07 10:23:34", "2023-08-07 10:24:57", "2023-08-07 10:28:37", "2023-08-07 10:29:01")), stock.price = c(102.2, 102.9, 103.1, 101.8, 101.7))

stock.trades <- stock.trades[order(stock.trades$timestamp),]

library(plyr)
mystock.prices <- ldply(option.trades$timestamp, function(tstamp) tail(subset(stock.trades, timestamp <= tstamp), 1))
names(mystock.prices)[1] <- "stock.timestamp"
myres <- cbind(option.trades, mystock.prices)

This method works. But for large dataframes, it is very slow.  Is there
a way to speed up the merge?

Thanks,
Naresh



More information about the R-help mailing list