2

i have a table "variables" and a table "variables_history", as following

create table variables
( 
    variables_id number,
    variables_name varchar2(50),
    variables_value varchar2(50),
    variables_updated_at timestamp
);


create table variables_history
( 
    variables_id number,
    variables_name varchar2(50),
    variables_value varchar2(50),
    variables_hist_updated_at timestamp
);

The history records are generated by a trigger as following.

CREATE OR REPLACE EDITIONABLE TRIGGER "myuser"."trigger_variables_update" 
  AFTER UPDATE ON myuser.variables
  REFERENCING NEW AS NEW OLD AS OLD
  FOR EACH ROW

BEGIN
  IF :old.variables_value <> :new.variables_value THEN
    INSERT INTO myuser.variables_history
      (variables_id,
       variables_name,
       variables_value,
       variables_hist_updated_at
    VALUES
      (variables_id,
       :old.variables_name,
       :old.variables_value,
       old.variables_updated_at);
  END IF;
END trigger_variables_update;

I also have a table with all the maintenances

create table maintenance
( 
    maintenance_id number,
    maintenance_status varchar2(20),
    maintenance_date timestamp
);

i need to generate a output with the maintenance_price based on the variables at the maintenance_date and variables_updated_at or variables_hist_updated_at

like this


WITH variables_data as
( SELECT 1 variables_id, 'maintenance_price' variables_name, '30.00' variables_value, '2020-08-01 05:00:00.000' variables_updated_at from dual),

variables_history_data as
(
SELECT 1 variables_id, 'maintenance_price' variables_name, '15.90' variables_value, '2019-10-01 11:30:00.000' variables_hist_updated_at from dual union all
SELECT 1 variables_id, 'maintenance_price' variables_name, '10.50' variables_value, '2020-01-01 01:00:00.000' variables_hist_updated_at from dual union all
SELECT 1 variables_id, 'maintenance_price' variables_name, '20.30' variables_value, '2020-05-01 12:30:00.000' variables_hist_updated_at from dual
),

maintenance_data as
(
SELECT 1 maintenance_id, 'COMPLETE' maintenance_status, '2019-02-01 00:30:00.000' maintenance_date from dual union all
SELECT 2 maintenance_id, 'COMPLETE' maintenance_status, '2019-05-01 01:30:00.000' maintenance_date from dual union all
SELECT 3 maintenance_id, 'COMPLETE' maintenance_status, '2019-11-01 02:30:00.000' maintenance_date from dual union all
SELECT 4 maintenance_id, 'COMPLETE' maintenance_status, '2020-07-10 05:30:00.000' maintenance_date from dual union all
SELECT 5 maintenance_id, 'FAILED' maintenance_status, '2020-08-02 11:30:00.000' maintenance_date from dual
SELECT 6 maintenance_id, 'COMPLETE' maintenance_status, '2020-08-20 11:30:00.000' maintenance_date from dual
)

Select 
    m.maintenance_id,
    to_char(m.maintenance_date, 'yyyy/mm/dd') as maintenance_date
    v.variables_value
from 
    maintenances m
    join variables v on m.maintenance_date >= v.variables_updated_at
    join variables_history vh on m.maintenance_date < variables_hist_updated_at
        where maintenance_status = 'COMPLETE';

This query is just a example, i know its wrong

I need the ouput be like this (and consider that the variable may have a new update). The "variable_value" needs to be value at the time the maintenance was generated.

maintenance_id | maintenance_date | variables_value |
---------------+------------------+-----------------+
            1  |        2019-02-01|           15.90 |
---------------+------------------+-----------------+
            2  |        2019-05-01|           15.90 |
---------------+------------------+-----------------+
            3  |        2019-11-01|           10.50 |
---------------+------------------+-----------------+
            4  |        2020-07-10|           20.30 |
---------------+------------------+-----------------+
            6  |        2020-08-20|           30.00 |
---------------+------------------+-----------------+
Malkath
  • 77
  • 3
  • 13
  • Your data model makes this unnecessarily difficult. Your history table should include the timestamp when that history record became effective. – Matthew McPeak Aug 27 '20 at 20:29
  • for that i need to create a new timestamp column on "variables_history" table (like "variables_created_at") and change de trigger to save the :old.variables_updated_at value? – Malkath Aug 28 '20 at 14:32
  • Or maybe just switch to the trigger to save the ":old.variables_updated_at" value at instead of set a current "systimestamp"? – Malkath Aug 28 '20 at 14:37
  • I was thinking to switch the trigger to save `:old.variables_updated_at` instead of `systimestamp`. It doesn't hurt to have both columns, if you want to have 'effective_from" (`:old.variables_updated_at`) and "effective_to" (`systimestamp`). But you should at least have the effective from date in the history table to make this easier. – Matthew McPeak Aug 28 '20 at 15:12
  • Thank you. I'll try that way. It's an old system, a little complicated to handle. – Malkath Aug 28 '20 at 19:36
  • I changed the trigger to save 'old.variables_updated_at', but now i need help to adjust the sql :D – Malkath Oct 19 '20 at 19:50
  • @RafaelFrancisco: you should not be changing the question after answers were posted, and accepted. Putting a bounty on it does not make it more legitimate. Instead, you should be asking a new question, describing your new use case and providing proper sample data and desired results. – GMB Oct 20 '20 at 00:12
  • What you try to implement is called a *Slowly Changing Dimension Type 2* (in Standard SQL there's a feature named *Temporal Table*). It's usually implemented using a pair of *start/end date*, Loading gets more complex, but then it's a simple `col between start_date and end_date` – dnoeth Oct 22 '20 at 13:45

2 Answers2

1

As I understand your data (and with the help of Matthew McPeak), the history table stores the date when a price became obsolete, while, on the other hand, the "live" table stores the date when it became effective.

You could approach this with two lateral joins:

select 
    m.maintenance_id,
    to_char(m.maintenance_date, 'yyyy/mm/dd') as maintenance_date,
    v.*,
    vh.*,
    coalesce(v.variables_value, vh.variables_value) as variables_value
from maintenances m
outer apply(
    select v.variables_value
    from variables_data v
    where v.variables_updated_at <= m.maintenance_date
) v
outer apply (
    select vh.variables_value
    from variables_history_data vh
    where vh.variables_hist_updated_at > m.maintenance_date
    order by vh.variables_hist_updated_at
    fetch first 1 row only
) vh
where m.maintenance_status = 'COMPLETE'
order by 1;

For your sample data, the query returns:

MAINTENANCE_ID | MAINTENANCE_DATE | VARIABLES_VALUE
-------------: | :--------------- | --------------:
             1 | 2019/02/01       |            15.9
             2 | 2019/05/01       |            15.9
             3 | 2019/11/01       |            10.5
             4 | 2020/07/10       |            null
             6 | 2020/08/20       |              30

Note that there is a glitch in your sample data. There should be a row in the history table with an timestamp that corresponds to the current timestamp of the live data - because of this, maintenance_id 4 does not match on anything.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Great answer, except the OP's data model is screwy. If I update a value from 100 to 200 at 11am, I am getting, presumably, a row in the history table saying value = 100 updated at 11am and a row in the main table saying the value is 200 updated at 11am. If that's a correct understanding of what's going on, I'm not sure it's quite captured in your post, though I completely agree on the `CROSS APPLY` approach. – Matthew McPeak Aug 27 '20 at 17:22
  • 1
    @MatthewMcPeak: ah yes you are right... I had that in the back of my mind when I started answering but I lost it somewhere. I changed the the query a little to accommodate that. Now I am starting to think that using `outer apply` would be as good, or better. – GMB Aug 27 '20 at 19:31
  • Still not there I think (will return latest historical record instead of current record for maintenance dates after the last change). I added a comment for the OP about his data model. This shouldn't be this hard. – Matthew McPeak Aug 27 '20 at 20:31
  • 1
    @MatthewMcPeak: you are right again! I went for 2 lateral joins instead (and added a fiddle...). Thanks! – GMB Aug 27 '20 at 20:54
  • @MatthewMcPeak I changed the trigger to save 'old.variables_updated_at', but now i need help to adjust the sql :D – Malkath Oct 19 '20 at 19:51
  • 1
    @RafaelFrancisco This isn't a free code-writing service. If you have a particular question that is arising when you try to adjust the SQL, I suggest you post it as a new question with data model, sample data, what you tried, and what your exact question is. – Matthew McPeak Oct 19 '20 at 19:58
0

You can use UNION ALL to combine the tables and then use LAG/LEAD to find the most recent (or next) value:

SELECT *
FROM   (
  SELECT maintenance_id,
         COALESCE(
           variables_value,
           LAG( variables_value, 1 ) IGNORE NULLS OVER ( ORDER BY maintenance_date ),
           LEAD( variables_value, 1 ) IGNORE NULLS OVER ( ORDER BY maintenance_date )
         ) AS variables_value,
         maintenance_date
  FROM   (
    SELECT maintenance_id,
           NULL AS variables_value,
           maintenance_date
    FROM   maintenances
    WHERE  maintenance_status = 'COMPLETE'
    UNION ALL
    SELECT NULL,
           variables_value,
           variables_updated_at
    FROM   variables
    UNION ALL
    SELECT NULL,
           variables_value,
           variables_hist_updated_at
    FROM   variables_history
  )
)
WHERE maintenance_id IS NOT NULL;

Which, for your sample data:

CREATE TABLE variables ( variables_id, variables_name, variables_value, variables_updated_at ) as
SELECT 1, 'maintenance_price', 30.00, TIMESTAMP '2020-08-01 05:00:00.000' from dual;

CREATE TABLE variables_history ( variables_id, variables_name, variables_value, variables_hist_updated_at ) as
SELECT 1, 'maintenance_price', 15.90, TIMESTAMP '2019-10-01 11:30:00.000' from dual union all
SELECT 1, 'maintenance_price', 10.50, TIMESTAMP '2020-01-01 01:00:00.000' from dual union all
SELECT 1, 'maintenance_price', 20.30, TIMESTAMP '2020-05-01 12:30:00.000' from dual;

CREATE TABLE maintenances ( maintenance_id, maintenance_status, maintenance_date ) as
SELECT 1, 'COMPLETE', TIMESTAMP '2019-02-01 00:30:00.000' from dual union all
SELECT 2, 'COMPLETE', TIMESTAMP '2019-05-01 01:30:00.000' from dual union all
SELECT 3, 'COMPLETE', TIMESTAMP '2019-11-01 02:30:00.000' from dual union all
SELECT 4, 'COMPLETE', TIMESTAMP '2020-07-10 05:30:00.000' from dual union all
SELECT 5, 'FAILED',   TIMESTAMP '2020-08-02 11:30:00.000' from dual union all
SELECT 6, 'COMPLETE', TIMESTAMP '2020-08-20 11:30:00.000' from dual;

Outputs:

MAINTENANCE_ID | VARIABLES_VALUE | MAINTENANCE_DATE            
-------------: | --------------: | :---------------------------
             1 |            15.9 | 01-FEB-19 00.30.00.000000000
             2 |            15.9 | 01-MAY-19 01.30.00.000000000
             3 |            15.9 | 01-NOV-19 02.30.00.000000000
             4 |            20.3 | 10-JUL-20 05.30.00.000000000
             6 |              30 | 20-AUG-20 11.30.00.000000000

If you swap LAG and LEAD (to prefer the next value rather than the previous):

SELECT *
FROM   (
  SELECT maintenance_id,
         COALESCE(
           variables_value,
           LEAD( variables_value, 1 ) IGNORE NULLS OVER ( ORDER BY maintenance_date ),
           LAG( variables_value, 1 ) IGNORE NULLS OVER ( ORDER BY maintenance_date )
         ) AS variables_value,
         maintenance_date
  FROM   (
    SELECT maintenance_id,
           NULL AS variables_value,
           maintenance_date
    FROM   maintenances
    WHERE  maintenance_status = 'COMPLETE'
    UNION ALL
    SELECT NULL,
           variables_value,
           variables_updated_at
    FROM   variables
    UNION ALL
    SELECT NULL,
           variables_value,
           variables_hist_updated_at
    FROM   variables_history
  )
)
WHERE maintenance_id IS NOT NULL;

Then the output is:

MAINTENANCE_ID | VARIABLES_VALUE | MAINTENANCE_DATE            
-------------: | --------------: | :---------------------------
             1 |            15.9 | 01-FEB-19 00.30.00.000000000
             2 |            15.9 | 01-MAY-19 01.30.00.000000000
             3 |            10.5 | 01-NOV-19 02.30.00.000000000
             4 |              30 | 10-JUL-20 05.30.00.000000000
             6 |              30 | 20-AUG-20 11.30.00.000000000

db<>fiddle here

MT0
  • 143,790
  • 11
  • 59
  • 117