0

I maintain a table in Oracle that contains several hundred thousand lines of code, including a priority column, which indicates for each line its importance according to the needs of the system.

ID BRAND COLOR VALUE SIZE PRIORITY EFFECTIVE_DATE_FROM EFFECTIVE_DATE_FROM
1 BL BLUE 58345 12 1 10/07/2022 NULL
2 TK BLACK 4455 1 1 10/07/2022 NULL
3 TK RED 16358 88 2 11/01/2022 NULL
4 WRA RED 98 10 6 18/07/2022 NULL
5 BL BLUE 20942 18 7 02/06/2022 NULL

At any given moment thousands more rows may enter the table, and it is necessary to SELECT from it the 1000 rows with the highest priority.

Although the naive solution is to SELECT using ORDER BY PRIORITY ASC, we find that the process takes a long time when the table contains a very large amount of rows (say over 2 million records).

The solutions proposed so far are to divide the table into 2 different tables, so that in advance the records with priority 1 will be entered into Table A, and the rest of the records will be entered in Table B, and then we will SELECT using UNION between the two tables. This way we save the ORDER BY process since Table A always contains priority 1, and it remains to sort only the data in Table B, which is expected to be smaller than the overall large table.

On the other hand, it was also suggested to leave the large table in place, and perform the SELECT using PARTITION BY on the priority column.

I searched the web for differences in speed or efficiency between the two options but did not find one, and I am debating how to do it. So which of the options is preferable if we focus on the efficiency and complexity of time?

Zar Tel
  • 19
  • 4
  • This task has one the most problematic case: each identifier may have different "last version" and you generally have no ability to predict which rows will be affected and which not. I do not know any algorithm that may easily solve this issue without change in the loading process for the new data. – astentx Jul 18 '22 at 15:01
  • In the old "good" times we used to use Index Organized Tables for that purpose where you put the Priority as the first element of the primary key. Just do a check if the IOT is still available and analyze the advantages and disadvantages if they fit your application. – Ali Avcı Jul 18 '22 at 15:25
  • `select * from t order by priority fecth next 100 rows only` should be blazing fast with an index on `priority`. Please elaborate on why this solution is not useful to you. – The Impaler Jul 18 '22 at 15:28
  • @TheImpaler. It's simple, I mentioned the way you suggested in the post as the naive solution that exists, however at the same time I also noted that the action takes a long time. Try to do this yourself every 7 seconds on average, with 4-6 million records in your table. – Zar Tel Jul 18 '22 at 17:27
  • 1
    It shouldn't take a long time ever. You are just retrieving 1000 rows every time. Does the table have the index `t (priority)`? That should do it. Please post the execution plan. – The Impaler Jul 18 '22 at 17:43

0 Answers0