1

I have one table which has 100 partitions, And I have procedure which reads the data partition wise. Suppose

P0 is 1 partition if I execute procedure

 exec xyz('P0'); 

it takes 3 minute to execute for 500k rows.

But when I create 100 sessions and run parallely something like in different different sessions

exec xyz('P1');
exec xyz('P2');
exec xyz('P3');

It takes 1 hr for 50M rows can anyone explain why its happening.

SDsolar
  • 2,485
  • 3
  • 22
  • 32
Sparsh
  • 39
  • 6
  • There are no concrete information about your actual scenario, so it is not really possible to know the answer. However, I still like the question, as it is modified generally, seeks the truth in a wider range than the actual question. – Lajos Arpad Feb 24 '17 at 07:55

1 Answers1

1

As far as I understand, your expectation is that the time needed for 100 partitions would be 100 * the time needed for a single partition, which is 100 * 3 minutes = 300 minutes = 5 hours. Instead of that, the whole task takes only 1 hour and 50 minutes.

The first question is whether the size of the partitions is equal. Maybe there are small partitions, which could explain the behavior you just described.

Also, the execution time on a given partition highly depends on what your stored procedure actually doing. For instance it might contain large selections, filling a lot of data into memory (which takes a lot of time) in some partitions, while the other partitions are executing it in a much quicker fashion, as records are filtered out before writing their content into memory.

Also, the stored procedure might contain write operations (insert, update, delete), which are very time consuming, especially if they affect many records. There is a high chance that the affected records' distribution among your partitions is uneven and therefore the execution time in different partitions might be different.

For example, if you have roughly 500k records in each partition and your stored procedure selects records by a criteria and does a given update for the items matching the criteria, then there is a high chance that the number of found and affected records will be different in the different partitions.

Or the partitions are distributed among servers with different lags or different properties, which affects the total time. Or the servers are doing some other jobs as well, which will increase execution time in a pseudo-random manner.

Lajos Arpad
  • 64,414
  • 37
  • 100
  • 175
  • My expectation is this must be complete in 3 minutes as 1 takes 3 minutes to complete one partition and I run all 100 by parallel different session. And all partition are almost of equal size. – Sparsh Feb 24 '17 at 09:06
  • @Sparsh It is not realistic to expect performance to improve 100x when running 100 threads. I'm not sure what the final number should be, but don't ever expect X threads to deliver X times the performance. How are you running the parallel jobs? If you're using DBMS_SCHEDULER, it's possible you're being limited by the JOB_QUEUE_PROCESSES parameter, or the way Oracle will frequently decrease the scheduled jobs depending on load. If performance is the highest priority, you generally want to use Oracle parallel statements and let Oracle do the parallelism, instead of creating your own jobs. – Jon Heller Feb 24 '17 at 20:38
  • @Sparsh I misunderstood the question. I thought you are running the stored procedure sequentially for some reason. You can expect better performance if you are running the jobs parallelly than sequentially, however, so many threads running at the very same time increases server burden greatly. Your assumption would be more-or-less valid if this would be executed on 100 different computers. Since it is highly probable that your case is different, hence the behavior you have experienced. – Lajos Arpad Feb 25 '17 at 09:35
  • FOR 50M rows sequentially takes 3hr and parallel 1hr, but if I run only for 1 partition it takes only 3 minutes, so I thought if I create 100 session and run 100 procedure at a time by passing different partition id , it won't take much time but in that case it takes 1hr. – Sparsh Mar 09 '17 at 10:54
  • @Sparsh the machine has to do a lot more jobs, it will not become magically much quicker if you do the job in separate threads. There will be some benefits in performance indeed, but you will never be even close to the case when it had to do 100x less work. – Lajos Arpad Mar 09 '17 at 15:59