I am trying to merge two tables with the conditions "ccode" = "Ticker", "Date"="Date" and "Time"= "Timestamp". However, if there is not an exact match of "Time" it should look at "Timeint" (up to -2 minutes). As this is something I can't do with left_join
- I was thinking about fuzzy_left_join
.
I am encountering two problems:
- My data is stored on bigquery and I can do a
left_join
which works. However, when I try afuzzy_left_join
, I get the following error message:
"Error: All columns in a tibble must be 1d or 2d objects: * Column
col
is NULL
- I tried to make a reproducable sample (which is a bit different as it is not stored as a list of 2) and it also does not work.
Here is a reproducable sample:
library(fuzzyjoin)
library(anytime)
calls.sample <- data.frame(ccode = c("MMM", "K", "A", "CAG", "PM"),
Date = c(20111020, 20111021, 20120102, 20110510, 20080710),
Time = c("09:30:00", "14:30:00", "11:00:00", "15:30:00", "13:00:00"),
Timeint = c("9:28:00", "14:28:00", "10:58:00", "15:28:00", "12:58:00")
)
str(calls.sample$Time)
calls.sample$Time <- as_hms(as.character(calls.sample$Time))
calls.sample$Timeint <- as_hms(as.character(calls.sample$Timeint))
stocks.sample <- data.frame(Ticker = c("MMM", "K", "A", "CAG", "PM"),
Date = c(20111020, 20111021, 20120102, 20110510, 20080710),
Timestamp = c("9:28:00", "14:30:00", "11:00:00", "15:30:00", "13:00:00"),
OpenPrice = c(5, 1,6,7,8))
stocks.sample$Timestamp <- as_hms(as.character(stocks.sample$Timestamp))
fuzzy_left_join(
calls.sample, stocks.sample,
by = c(
"ccode" = "Ticker",
"Date" = "Date",
"Time" = "Timestamp",
"Timeint" = "Timestamp"
),
match_fun = list(`==`, `==`, `<=`, `>=`)
)
I am looking for:
ccode Date Time OpenPrice
1 MMM 20111020 09:30:00 5
2 K 20111021 14:30:00 1
3 A 20120102 11:00:00 6
4 CAG 20110510 15:30:00 7
5 PM 20080710 13:00:00 8
I am grateful for any help on this one :)!