I am looking for the most efficient way to run a relatively complicated query against a relatively large table.
The setup is as follows:
- I have a table that holds records of phases that can run in parallel with each other
- The table holds more than 5 million records and keeps growing
- The data goes back about 5 years
- For performance reasons, the query could be restricted to the last 3 months (about 300,000 records), but only if running it against the whole table is not feasible
- Oracle version: 11g
A data sample looks like this:
Table Phases (ID, START_TS, END_TS, PRIO)
1 10:00:00 10:20:10 10
2 10:05:00 10:10:00 11
3 10:05:20 10:15:00 9
4 10:16:00 10:25:00 8
5 10:24:00 10:45:15 1
6 10:26:00 10:30:00 10
7 10:27:00 10:35:00 15
8 10:34:00 10:50:00 5
9 10:50:00 10:55:00 20
10 10:55:00 11:00:00 15
Above you can see how the information is currently stored (of course there are several other columns with irrelevant information).
There are two requirements (or problems to be solved):
- If we sum the durations of all the phases, the result is MUCH more than the one hour that the data above actually covers, so the actual elapsed time is needed instead. (There can be gaps between the phases, so taking the first start_ts and the last end_ts would not be sufficient.)
- The data should be displayed so that it is visible which phases run in parallel with which, and which phase had the highest priority at each point in time, as shown in the expected view below.
In that view it is easy to identify the highest-priority phase at each point in time (HIGHEST_PRIO; in this data a lower PRIO value means a higher priority), and adding up the durations of those rows gives the actual total duration.
View V_Parallel_Phases (ID, START_TS, END_TS, PRIO, HIGHEST_PRIO)
-> Optional Columns: Part_of_ID / Runs_Parallel
1 10:00:00 10:05:20 10 True (--> Part_1 / False)
1 10:05:20 10:15:00 10 False (--> Part_2 / True)
2 10:05:00 10:10:00 11 False (--> Part_1 / True)
3 10:05:20 10:15:00 9 True (--> Part_1 / True)
1 10:15:00 10:16:00 10 True (--> Part_3 / True)
1 10:16:00 10:20:10 10 False (--> Part_4 / True)
4 10:16:00 10:24:00 8 True (--> Part_1 / True)
4 10:24:00 10:25:00 8 False (--> Part_2 / True)
5 10:24:00 10:45:15 1 True (--> Part_1 / True)
6 10:26:00 10:30:00 10 False (--> Part_1 / True)
7 10:27:00 10:35:00 15 False (--> Part_1 / True)
8 10:34:00 10:45:15 5 False (--> Part_1 / True)
8 10:45:15 10:50:00 5 True (--> Part_2 / True)
9 10:50:00 10:55:00 20 True (--> Part_2 / False)
10 10:55:00 11:00:00 15 True (--> Part_2 / False)
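The logic behind the expected view can be sketched as follows. This is only a minimal Python illustration of the desired result, not the code of the reporting tool and not a suggestion for how Oracle should compute it. The names `PHASES` and `split_phases` are hypothetical. It assumes that a lower PRIO value wins (as in the view above), that ties are broken by the lowest ID, and that zero-padded HH:MM:SS strings can be compared directly.

```python
# Sample data from the Phases table: (ID, START_TS, END_TS, PRIO).
PHASES = [
    (1, "10:00:00", "10:20:10", 10),
    (2, "10:05:00", "10:10:00", 11),
    (3, "10:05:20", "10:15:00", 9),
    (4, "10:16:00", "10:25:00", 8),
    (5, "10:24:00", "10:45:15", 1),
    (6, "10:26:00", "10:30:00", 10),
    (7, "10:27:00", "10:35:00", 15),
    (8, "10:34:00", "10:50:00", 5),
    (9, "10:50:00", "10:55:00", 20),
    (10, "10:55:00", "11:00:00", 15),
]


def split_phases(phases):
    """Split each phase wherever its HIGHEST_PRIO flag changes.

    Returns tuples (id, start, end, prio, highest) sorted by id and start.
    Assumption: the lowest PRIO value is the highest priority, ties go
    to the lowest ID.
    """
    # Every start or end timestamp is a point where the winner may change.
    bounds = sorted({t for _, start, end, _ in phases for t in (start, end)})
    parts_by_id = {}
    for lo, hi in zip(bounds, bounds[1:]):
        # Phases that cover the whole elementary interval [lo, hi).
        active = [p for p in phases if p[1] <= lo and hi <= p[2]]
        if not active:
            continue  # a gap between phases
        winner = min(active, key=lambda p: (p[3], p[0]))[0]
        for pid, _, _, prio in active:
            highest = pid == winner
            parts = parts_by_id.setdefault(pid, [])
            if parts and parts[-1][1] == lo and parts[-1][3] == highest:
                parts[-1][1] = hi  # same flag as before: extend the part
            else:
                parts.append([lo, hi, prio, highest])
    return [
        (pid, start, end, prio, highest)
        for pid in sorted(parts_by_id)
        for start, end, prio, highest in parts_by_id[pid]
    ]


if __name__ == "__main__":
    for row in split_phases(PHASES):
        print(row)
```

For the sample data this yields the same 15 rows as the expected view (in a different order), for example `(1, '10:00:00', '10:05:20', 10, True)` followed by `(1, '10:05:20', '10:15:00', 10, False)`.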
Unfortunately I am not aware of an efficient way to write this query. The current solution was to do the above calculations programmatically in the tool that generates a large report, but it was a total failure. The report needed 30 seconds before these calculations were added; now it needs over 10 minutes, without even taking the priorities of the phases into consideration.
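For reference, the first requirement on its own (the actual elapsed time) boils down to merging overlapping intervals. The following is a minimal Python sketch of that idea, not the code used in the reporting tool. The helper names `to_seconds` and `covered_seconds` are hypothetical, and timestamps are assumed to be HH:MM:SS strings within a single day.

```python
def to_seconds(hhmmss):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    hours, minutes, seconds = map(int, hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def covered_seconds(intervals):
    """Total time covered by (start, end) pairs, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged block and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# START_TS and END_TS of the ten sample phases.
SAMPLE = [
    ("10:00:00", "10:20:10"), ("10:05:00", "10:10:00"),
    ("10:05:20", "10:15:00"), ("10:16:00", "10:25:00"),
    ("10:24:00", "10:45:15"), ("10:26:00", "10:30:00"),
    ("10:27:00", "10:35:00"), ("10:34:00", "10:50:00"),
    ("10:50:00", "10:55:00"), ("10:55:00", "11:00:00"),
]
INTERVALS = [(to_seconds(s), to_seconds(e)) for s, e in SAMPLE]
NAIVE_SUM = sum(end - start for start, end in INTERVALS)

if __name__ == "__main__":
    print(NAIVE_SUM)                   # 6185 seconds (about 1h43m)
    print(covered_seconds(INTERVALS))  # 3600 seconds (exactly one hour)
```

The sort dominates the cost, so the merge itself is a single pass over the sorted rows.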
Then I thought of translating this code into SQL as one of the following: a) a view, b) a materialized view, or c) a table that I would fill with a procedure once in a while (depending on how long the procedure takes).
PS: I am aware that Oracle has analytic functions that can handle complicated queries, but I do not know which of them could actually help with this problem.
Thank you in advance!