0

In BODS, can we create index on a table while data is being read from it?

Actually I am using an ETL tool to load data to target. The performance is very slow and I do not want the data load to stop.

Will the performance improve if I create the index while the data load is in progress?

Sandra Rossi
  • 11,934
  • 5
  • 22
  • 48
Ashutosh Rai
  • 123
  • 9
  • 1
    What is your DBMS and what you expect from indexing? In general it should not be any issues of index creation while data is read from table since reading data generates no additional data or heavy locks on data. What about performance: why you expect it to be improved by indexing? You didn't provided any model info and bottlenecks of your code – astentx Oct 27 '20 at 20:51
  • SO will it improve the performance once the index is created? @astentx – Ashutosh Rai Oct 27 '20 at 20:57
  • 1
    Random index creation can have any impact on performance: it may degrade or improve it, no one can guess. If it always improve performance, DBMS developers would write the code to always create it by default, but it is not. – astentx Oct 27 '20 at 21:04

1 Answers1

1

In an analytical DB you basically have 2 use cases for indexes:

  1. To speed up data loads
  2. To speed up queries

The more indexes you have on a table the slower it will be to load. Therefore, traditionally, "query" indexes were dropped at the start of each data load and re-built once the load was completed. This is still good practice, where possible, but obviously if you have massive tables (and therefore excessive index re-build times), users running queries during data loads or continuous/streaming loading then this is not possible.

Creating indexes while loading data is likely to slow down the data load - not necessarily because of any impact on the tables being indexed but because indexing uses DB resources and therefore those resources are not available for use by the data loading activities.

Creating an index on a table while that table is being loaded will not speed up the data load and will probably slow it down (see previous paragraph). When SQL is executed on a DB one of the first thing the DB will do is generate an execution plan i.e. determine the most efficient way to execute the statement based on table statistics, available indexes, etc. Once the SQL statement is executing (based on the plan), it will not then continuously check if the indexes have changed since the execution started, re-calculate the plan and re-execute the statement if there is now a more efficient plan available.

Hope this helps? Please tick this answer if it does

NickW
  • 8,430
  • 2
  • 6
  • 19