0

Given we have a big table in a relational database we need to query.

We have two options:

  • query the whole table
  • query subsets of data inside the table i.e. rows from 1 to 1000, then 1001 to 2000 etc.

Does this separation make some sense?

Does it depend on query structure?

Let's add some math. Given some query execution time is proportional to n^3 where n is the number of rows in the table. This means in first case query execution time is proportional to n^3. As for second option - its different. Total time would be (n/3)^3 + (n/3)^3 + (n/3)^3 = n^3 / 9 which is better.

Real life is more complicated: the query would not be the same in this case, we have to spend some time limiting the rows to the subset.

Also number of connections and concurrency of database can be limited thus we would not be able to query it simultaneously by 10 queries, for example, at least with the same speed.

But does these reasons make sense? May this help to cut time expenses for some big tables?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
MiamiBeach
  • 3,261
  • 6
  • 28
  • 54
  • Are you really retrieving the whole table, or applying a condition; and if you're filtering do you have indexes on the columns that you filter? Where are you getting the n^3 figure from? – Alex Poole Sep 23 '15 at 14:12
  • 1
    I don't consider it possible to answer this question as it's so dependent on the data and the setup of your database. The best suggestion I can offer is to try it both ways and go with what works best. Best of luck. – Bob Jarvis - Слава Україні Sep 23 '15 at 14:18
  • Also, your math is off. (n/3)^3 = (n^3)/27. However, I don't see any support for your assertion that "query execution time is proportional to n^3". Where did you get that from? Why is it not "proportional to n" or "n^2" or "e^n" or "n^e"? In addition - how many rows are there in this table? Or, what does "big" mean? – Bob Jarvis - Слава Україні Sep 23 '15 at 14:23
  • 1
    Consider partitioning your table if it makes sense form a business perspective. – HLGEM Sep 23 '15 at 14:28
  • Well, n^3 notation is just an example. Generally we cannot say how the query execution time changes when number of rows grows. For example if we are searching for a row in a indexed table it is log(n), if fetching all the table it is n, I guess there are queries and functions in SQL that may work in n^2 or even n^3 time. I have just shown that for some types of queries this partitioning makes some sense from the perspective of mathematics. – MiamiBeach Sep 23 '15 at 14:48

1 Answers1

0

It depends on a lot of criteria. Some of them being :

  1. How busy the database is ? That is how many parallel query is running ?

    Reason : If there are a large number of query running or any query with a number of parallel session then querying on the big table will be slow while smaller will work faster.

  2. Into how many smaller table the bigger table have been divided into ?

    Reason : One point to consider here that if a big table is divided into several small tables and run the query on each of the smaller tables, then the individual results needed to be aggregated. This may take time depending on the query.

  3. Type of query is being executed

    Reason : If you are running a query having filtering condition on a column and you divide the large table based on values of that column, then you can skip some of the tables based on query condition and hence reduce time of output

Overall in such a scenario instead of dividing a big table into smaller ones it is better to partition the table. Range Partition can be used on the bigger table for faster query execution.

Kapes
  • 64
  • 1
  • 7