Situation: I have an exchange rate table like this:

date_from   cur1 coef1          cur2 coef2
2017-01-01  CZK  27.000000000   EUR  1.000000000
2017-07-03  EUR  1.000000000    CZK  26.150000000
2017-07-03  JPY  100.000000000  CZK  19.500000000
2017-10-05  JPY  1000.0000000   EUR  7.54761885

Notice that sometimes cur1 and cur2 can be switched for the same pair. The table also contains other currency pairs. The reason for the two coefficients is that the table is filled in manually (to make the numbers more comprehensible to a human brain -- see the JPY conversion).

Then I have another table with invoice rows where the price is expressed in the local currency (that is, each row has its own currency unit next to the price value).
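
For illustration, the invoice-row table could look something like this (the names are my assumption, not a fixed schema):

-- Hypothetical shape of the invoice-row table (names assumed for illustration)
CREATE TABLE invoices (
    id INT NOT NULL PRIMARY KEY,
    [date] DATE NOT NULL,          -- the invoice date
    cur CHAR(3) NOT NULL,          -- local currency code, e.g. 'CZK'
    amount DECIMAL(18,2) NOT NULL  -- price in the local currency
);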

I need to do some SELECT over the invoice-row table and transform the prices to be shown in a chosen target currency (say, everything in euros). How can I do that efficiently?

My first attempts: I know the target currency in advance, so it would probably be better to build a temporary table with a simplified structure that can be joined easily. Let the target currency be EUR. Then only a subset of the above table will be used, some pairs will be switched, and the two coefficients will be converted to a single rate. The target currency will be fixed or implicit. From the above table, the JPY-CZK pair would not be part of it:

date_from   cur  rate
2017-01-01  CZK  27.000000000
2017-07-03  CZK  26.150000000
2017-10-05  JPY  0.00754761885
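
A sketch of how I imagine deriving that table (untested; it assumes the manual table is named rates, and one direction for rate has to be picked consistently -- here units of cur per 1 EUR, as in the CZK rows; for the opposite direction, as in the JPY row, the operands would be swapped):

-- Untested sketch: fold both orientations of a pair into one EUR-based rate.
SELECT
    date_from,
    cur1 AS cur,
    CAST(coef1 AS DECIMAL(28,12)) / coef2 AS rate  -- coef1 cur1 = coef2 EUR
FROM rates
WHERE cur2 = 'EUR'
UNION ALL
SELECT
    date_from,
    cur2 AS cur,
    CAST(coef2 AS DECIMAL(28,12)) / coef1 AS rate  -- coef1 EUR = coef2 cur2
FROM rates
WHERE cur1 = 'EUR';

Pairs that do not involve EUR at all (like JPY-CZK above) drop out automatically.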

To join the rows with the other table I need not only date_from but also date_to. To be able to use BETWEEN in the join condition, I would like date_to to be the day just before the next period starts. Here for CZK, I need a record like:

date_from   date_to     cur  rate
2017-01-01  2017-07-02  CZK  27.000000000

Notice that date_to is one day before the next record's date_from.

However, I also need to automatically add some boundary values for the dates before and after the explicitly expressed intervals. I need something like this:

date_from   date_to     cur  rate
1900-01-01  2016-12-31  CZK  27.000000000   <-- guessed rate from the next; fixed start at the year 1900
2017-01-01  2017-07-02  CZK  27.000000000
2017-07-03  3000-01-01  CZK  26.150000000   <-- explicit rate; boundary year to 3000

Plus similarly for the other currencies in the same temporary table...

1900-01-01  2017-10-04  JPY  0.00754761885  <-- rate guessed from the next; fictional date_from
2017-10-05  3000-01-01  JPY  0.00754761885  <-- explicit rate; fictional date_to

How can I efficiently construct such a temporary table?
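
Something along these lines is what I imagine (untested sketch; eur_rate stands for the simplified single-rate table above):

-- Untested sketch over eur_rate(date_from, cur, rate)
WITH r AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY cur ORDER BY date_from) AS rn,
        LEAD(date_from) OVER (PARTITION BY cur ORDER BY date_from) AS next_from
    FROM eur_rate
)
-- explicit periods: each rate lasts until the day before the next one
SELECT
    date_from,
    COALESCE(DATEADD(DAY, -1, next_from), '3000-01-01') AS date_to,
    cur,
    rate
FROM r
UNION ALL
-- boundary rows: backfill 1900-01-01 .. the day before the first record,
-- guessing the rate from that first record
SELECT
    '1900-01-01',
    DATEADD(DAY, -1, date_from),
    cur,
    rate
FROM r
WHERE rn = 1;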

Do you have any other suggestions related to the problem?

Update: I have posted my solution to Code Review: https://codereview.stackexchange.com/q/177517/16189 Please have a look and point out the flaws.

pepr
  • Which [DBMS](https://en.wikipedia.org/wiki/DBMS) product are you using? Postgres? Oracle? "_SQL_" is just a query language, not the name of a specific database product. –  Oct 05 '17 at 14:14
  • I am using MS-SQL if it is important. I thought it could be solved using standard SQL. – pepr Oct 05 '17 at 14:34
  • That is a very weird way to store exchange rates. Normally you just have an exchange rate, not two coefficients. Suppose you want the exchange rate for USD to EUR; call it `X`. Then if you have 1 USD, you would get `1/X` EUR (yes, division). In other words, 1 EUR = `X` USD. This is how exchange rates are displayed on the exchange markets. In your case, the exchange rate for JPY to EUR would be 132.241, not 0.00756193. – TT. Oct 09 '17 at 15:06
  • Side note: [you shouldn't be using `BETWEEN` with date/time types](https://sqlblog.org/2011/10/19/what-do-between-and-the-devil-have-in-common) or any inclusive upper bound; prefer an exclusive upper bound (`<`), especially on SQL Server. It's easier to understand if you consider timestamps to have infinite precision: there is no such thing as an "end" instant, only the start instant of the next state, which the current value is before (less than). – Clockwork-Muse Oct 09 '17 at 18:07
  • @TT: This is not such an unusual way to store exchange rates when they are maintained manually (the two coefficients -- like here https://www.cnb.cz/en/financial_markets/foreign_exchange_market/exchange_rate_fixing/daily.jsp). What would be the difference between dividing by x and multiplying by 1/x? – pepr Oct 10 '17 at 06:32
  • @pepr If you retrieve exchange rates from e.g. Bloomberg or Reuters, you get them the way I described. That's how the markets have these rates. Why would you then store them any other way? I suppose you could maintain a second coefficient that is always 1... but what's the point? Just a waste of space. Perhaps things like pence are expressed with a coefficient; I'd have to check that. But then you could factor that out to get just one coefficient (i.e. just one exchange rate). – TT. Oct 10 '17 at 06:33
  • @TT: When you get the rates from other sources, they do it differently. I do not know why. There are also cases when they show both x and 1/x. Another reason is that the target currency will not be fixed. This way, all currencies are kind of equal. The second coefficient is not always 1. It can be 100 or 1000. – pepr Oct 10 '17 at 06:36
  • @Clockwork-Muse: I will actually work with whole days, so the boundaries are exact. – pepr Oct 10 '17 at 06:37
  • I write software for stock brokers and asset managers, and the people in that market are used to an exchange rate (i.e. one number) that is used as a divisor to get to the other currency: if the exchange rate for USD to EUR is X and you have Y USD, you'd get Y/X EUR. It's counterintuitive at first, that's for sure. But if that's the information you get, you'd have to convert it to 1/X if you want it the other way around. And then if you have to display it to users, you'd have to do the reverse, and rounding errors might show a different result. – TT. Oct 10 '17 at 06:40
  • @TT: Actually, that table with two coefficients already exists. My task is to process it so it is useful for converting the currencies on invoices. Possibly, it is more intuitive to have 100 JPY = 19.5 CZK than a single number. They are stored as decimals (no information loss). I have added a link to Code Review with my solution. – pepr Oct 10 '17 at 06:52
  • In any case, whether you want to maintain a multiplicative rate or a divisor, I'd add a computed column that gives a single rate rather than two coefficients. Otherwise, using the rate table gets awkward, having to take two coefficients into account each time. – TT. Oct 10 '17 at 06:52
  • @TT: I think we agree with each other. The exchange rate table with two coefficients is updated manually. The goal is to get another table with `date_from` and `date_to`. See the link mentioned in the update. – pepr Oct 10 '17 at 06:56
  • I'll add my temp table solution to my answer again. If you do have a lot of rates and a lot of invoices, it might improve execution time. – TT. Oct 10 '17 at 06:57
  • @TT: Please, comment directly on https://codereview.stackexchange.com/q/177517/16189 It already uses the temporary table. – pepr Oct 10 '17 at 07:02
  • Added it. Tested on a test database with 136772 rates and 95000 invoices, it did improve a bit with a temp table: 3 seconds for temp table, 9 seconds for correlated subquery. – TT. Oct 10 '17 at 07:03

2 Answers


I don't think that you will need a temporary table.

You first need to get, for every invoice, the rate with the highest date_from value. That is simply a MAX on date_from, restricted to rates dated no later than the invoice's date. For the example I used CZK as the currency to convert to:

SELECT 
  invoices.id
  , invoices.cur
  , MAX(date_from) AS current
FROM invoices
JOIN rates
ON rates.cur1 = invoices.cur
AND invoices.date >= rates.date_from  -- a rate is effective from its date_from onward
AND rates.cur2 = 'CZK'
GROUP BY invoices.id, invoices.cur, invoices.date

Because the GROUP BY limits which columns are available in the SELECT, we now have to join the two tables again and then join them with the query above that finds the current rate:

SELECT 
  invoices.id
  , invoices.cur
  , invoices.amount
  , 'CZK' AS otherCurrency
  , invoices.amount / rates.coef1 * rates.coef2 AS converted  -- coef1 cur1 = coef2 cur2
FROM invoices
JOIN
  (SELECT 
    invoices.id
    , invoices.cur
    , MAX(date_from) AS current
  FROM invoices
  JOIN rates
  ON rates.cur1 = invoices.cur
  AND invoices.date >= rates.date_from
  AND rates.cur2 = 'CZK'
  GROUP BY invoices.id, invoices.cur, invoices.date) AS current_rate
ON invoices.id = current_rate.id
JOIN rates
ON current_rate.current = rates.date_from
AND rates.cur1 = invoices.cur
AND rates.cur2 = 'CZK'

I prepared a fiddle to show the SQL in action.

ulferts
  • Thanks for the solution. The problem is that there are a lot of invoices, and many records per currency in the rates table. The join/group then has to process a lot of data only to discover the rate. On the other hand, the rate table is updated manually, say once a week per currency. Because of that I am thinking about some optimization that would preprocess the rate table so that a simpler join with the invoices could be done. – pepr Oct 06 '17 at 06:51

Suppose the exchange rate table is as follows, with exchange rates to your target currency:

CREATE TABLE currency_rate (
    currency_id INT NOT NULL,
    update_date DATE NOT NULL,
    rate DECIMAL(18,6) NOT NULL,
    CONSTRAINT PK_currency_rate PRIMARY KEY(currency_id,update_date)
);
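
For illustration, it might be filled like this (sample values only; I use ISO 4217 numeric codes for currency_id, and the rates are taken from the question, rounded to the column's scale):

-- Sample rows for illustration only (203 = CZK, 392 = JPY in ISO 4217)
INSERT INTO currency_rate (currency_id, update_date, rate) VALUES
    (203, '2017-01-01', 27.000000),
    (203, '2017-07-03', 26.150000),
    (392, '2017-10-05', 0.007548);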

You can use a correlated subquery to link invoices to the exchange rate:

SELECT
    i.*,
    cr.rate
FROM
    invoice AS i
    INNER JOIN currency_rate AS cr ON
        cr.currency_id=i.currency_id AND
        cr.update_date=(
            SELECT
                MAX(cr_i.update_date)
            FROM
                currency_rate AS cr_i
            WHERE
                cr_i.currency_id=i.currency_id AND
                cr_i.update_date<=i.invoice_date
        );

If you do have a lot of invoices and a lot of rates, a solution based on a temporary table might improve performance. Best to measure which one wins. Based on the same currency_rate table definition:

CREATE TABLE #cr (
    date_from DATETIME,
    date_to DATETIME,
    currency_id INT,
    rate DECIMAL(18,6)
);
CREATE CLUSTERED INDEX IX_tcr_curr_dt ON #cr(currency_id,date_from);

INSERT INTO #cr (
    date_from,
    date_to,
    currency_id,
    rate
)
SELECT
    -- the first period per currency is backfilled to SQL Server's minimum DATETIME
    date_from=CASE WHEN LAG(update_date) OVER (PARTITION BY currency_id ORDER BY update_date) IS NULL THEN '17530101' ELSE update_date END,
    -- each rate lasts until the day before the next update; the last one is open-ended
    date_to=ISNULL(DATEADD(DAY,-1,LEAD(update_date) OVER (PARTITION BY currency_id ORDER BY update_date)),'99991231'),
    currency_id,
    rate
FROM
    currency_rate AS cr;

SELECT
    i.*,
    c.rate
FROM
    invoices AS i
    INNER JOIN #cr AS c ON 
        c.currency_id=i.currency_id AND 
        c.date_from<=i.invoice_date AND 
        c.date_to>=i.invoice_date;

DROP TABLE #cr;
TT.
  • Thanks for teaching me `LAG` and `LEAD`. I did not know about them. – pepr Oct 10 '17 at 07:11
  • You're quite welcome. I'll check your question on CR later today. – TT. Oct 10 '17 at 07:12
  • I need `date_from` the same as in the original table. So, I let it be, and I have modified the `date_to = DATEADD(DAY, -1, COALESCE(LEAD(date_from) OVER (PARTITION BY currency_id ORDER BY date_from), '3000-01-01'))`. This way I do not need to self-join the table. On the other hand, I still need to add the row that covers the period before the first explicit exchange-rate record. – pepr Oct 10 '17 at 11:05
  • @pepr I suppose you can do that with a `UNION ALL` for the period before the first explicit exchange-rate record. – TT. Oct 10 '17 at 11:19
  • I did another INSERT for the first record of each currency. However, I was not able to do that without using a CTE. See the updated https://codereview.stackexchange.com/q/177517/16189 – pepr Oct 10 '17 at 11:30
  • @pepr I'll take a look this evening if I get to it, but just glancing at it, you could have determined the minimum date for each currency (`SELECT currency, MIN(date_from) ... GROUP BY currency`) in a derived table and linked that to your currency rate table. – TT. Oct 10 '17 at 11:45
  • Thanks for the hint. I did use a CTE anyway -- but differently than before. It looks much cleaner now. – pepr Oct 10 '17 at 13:30
  • @pepr Good to hear. As I said, I will visit your question on Code Review later and give some more remarks if relevant. – TT. Oct 10 '17 at 13:32