Regarding the solution to this question: wouldn't the correlated subquery be executed once for every row of ENTRIES? Can anyone think of another query that would perform better on really big datasets?
Rather than joining via a correlated subquery, maybe something leveraging the LAG/LEAD window functions (see the sketch at the end of this post)? I'm just wondering. I would appreciate any input or ideas, as I'm facing a performance issue with a similar problem. I've been searching the net and have already tried various approaches, for instance using ROW_NUMBER() (also sketched below), but so far I've found nothing that performs any better.
CREATE TABLE EXCHANGE_RATES (
    VALID_FROM date,
    RATE decimal
);
CREATE TABLE ENTRIES (
    ID int NOT NULL,
    DATE date,
    PRIMARY KEY (ID)
);
INSERT INTO EXCHANGE_RATES VALUES
     ('2020-01-01', 101)
    -- 2020-01-02 missing
    ,('2020-01-03', 103)
    ,('2020-01-04', 104)
    ,('2020-01-05', 105)
    -- 2020-01-06 missing
    -- 2020-01-07 missing
    -- 2020-01-08 missing
    ,('2020-01-09', 109)
;
INSERT INTO ENTRIES VALUES
     (1, '2020-01-02')
    ,(2, '2020-01-03')
    ,(3, '2020-01-04')
    ,(4, '2020-01-08')
;
And here is the current solution:
SELECT
     e.*
    ,r.RATE
FROM
    ENTRIES e
    LEFT JOIN EXCHANGE_RATES r
        -- correlated subquery: find the latest rate on or before each entry's date
        ON r.VALID_FROM = (SELECT MAX(VALID_FROM)
                           FROM EXCHANGE_RATES
                           WHERE VALID_FROM <= e.DATE)
;
The results are correct; I'm just looking for a better-performing way, if there is one:
ID  DATE        RATE
1   2020-01-02  101
2   2020-01-03  103
3   2020-01-04  104
4   2020-01-08  105
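For illustration, here is the kind of LEAD-based rewrite I had in mind: turn each rate row into a half-open validity interval [VALID_FROM, next VALID_FROM) and join on that range instead of doing a per-row lookup. This is just an untested sketch against the tables above, not something I've benchmarked:

SELECT
     e.*
    ,r.RATE
FROM
    ENTRIES e
    LEFT JOIN (
        SELECT
             VALID_FROM
            -- start of the next rate; NULL for the last, still-open rate
            ,LEAD(VALID_FROM) OVER (ORDER BY VALID_FROM) AS VALID_TO
            ,RATE
        FROM EXCHANGE_RATES
    ) r
        ON  e.DATE >= r.VALID_FROM
        AND (e.DATE < r.VALID_TO OR r.VALID_TO IS NULL)
;

And the ROW_NUMBER() variant I mentioned was along these lines: join every earlier-or-equal rate to each entry, then keep only the newest one per entry:

SELECT ID, DATE, RATE
FROM (
    SELECT
         e.ID
        ,e.DATE
        ,r.RATE
        -- rank the candidate rates per entry, newest first
        ,ROW_NUMBER() OVER (PARTITION BY e.ID ORDER BY r.VALID_FROM DESC) AS rn
    FROM
        ENTRIES e
        LEFT JOIN EXCHANGE_RATES r
            ON r.VALID_FROM <= e.DATE
) x
WHERE rn = 1
;

Both return the same four rows on the sample data above; whether either actually beats the correlated subquery presumably depends on the engine, the table sizes, and the indexes on VALID_FROM and DATE.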