0

This is a simple question. Suppose I have a massive table (5 million rows), and I have no option but to do a full table scan.

SELECT * FROM Table1

Is there any way at all to make my database return the results faster?

Background:

We have an application provided to us by a third party which was designed to be robust/flexible - there are very few, large tables in its database. For example, one of the tables is the "Object" table, which stores all objects. Another table is the "Relationship" table, that captures all relationships between objects. It essentially allows you to store anything without changing its schema.

My task is to design a separate reporting application - an application that queries this database. It has to report on live data, and because of the size of the tables, plus the complexity of the queries, there are performance issues.

I was wondering how I should go about handling the performance issues. I've created index after index, but the queries are still very complex, and at the end of the day, I still have to make multiple full table scans.

The good news is, I'm still at the design phase - so I'd like to hear as many suggestions as possible, in case I've missed something.

painiyff
  • PS: 5 million is not large – Ed Heal Dec 12 '15 at 00:48
  • @EdHeal ... It is certainly 'large' enough for there to be performance issues on full table scans.... – painiyff Dec 12 '15 at 00:56
  • RAID 0 some SSD drives! – Neil McGuigan Dec 12 '15 at 00:57
  • @NeilMcGuigan Hmm thought of SSDs but haven't thought of RAID. Thanks. – painiyff Dec 12 '15 at 01:03
  • What are you doing with the data? If you select everything and return it to a client, it's highly likely that the bottleneck is the network and the client application, not the database. If you are processing the data in Oracle to, say, build aggregates or cubes, that can likely be optimized. Otherwise, the question is too broad. You could, for example, buy an Exadata machine. But that's an expensive solution to a rather poorly specified problem. – Justin Cave Dec 12 '15 at 02:12
  • @JustinCave Sorry, I wanted to keep the question open because I wanted to consider as many suggestions as possible, as I'm still in the design stage of the application. However, I did update the OP with some background. – painiyff Dec 14 '15 at 17:21

2 Answers

4

There are at least 9 ways to improve full table scans. But improving full table scans may not necessarily improve overall performance depending on exactly what is done with the data. For example, if 5 million rows are being sent to an application, the network will be the bottleneck and improving full table scans won't help.

Some of these methods are extreme, may only help in rare cases, and may only slightly improve performance. Think carefully before you apply esoteric tuning ideas. Make sure that full table scans are the bottleneck before you spend a lot of time doing something that may only improve performance by 1%.

  1. Parallelism: SELECT /*+ PARALLEL */ * FROM Table1; Parallelism can easily improve full table scan performance by an order of magnitude on many systems. But there are many limitations - a sane configuration, a large enough table, Enterprise Edition only, etc.
  2. DB_FILE_MULTIBLOCK_READ_COUNT: This parameter controls how many blocks are read at a time. For a full table scan, more is usually better. The parameter is complicated and often misunderstood; as long as it's left at the default, it's probably close to optimal (see the SQL sketch after this list for how to check it).
  3. Hardware: There are many ways to improve disk performance - SSDs, different RAID options, faster disks, more disks, etc.
  4. Increase Memory: Cache more of the table's blocks by increasing the amount of memory, specifically the buffer cache, which is part of the SGA. This may mean increasing the size of MEMORY_TARGET or SGA_TARGET.
  5. Shrink the Table: If the table is read-only, set it to PCTFREE 0 to reclaim the 10% of space normally reserved for future updates. Also, if the table has changed a lot in the past, ALTER TABLE MY_TABLE MOVE; will reorganize it and potentially fill in empty space. Shrinking the physical size of the table makes it faster to read.
  6. Reduce Other Activity: Check the database, and any other databases on the server, for other activity. There may be other processes using up some of the resources.
  7. Compression: Some tables can be shrunk significantly using different types of compression. This is a trade-off between I/O and CPU: normally, the time spent decompressing data is less than the extra time spent retrieving more blocks.
  8. Uniform Extent Management: Auto-allocation, the default, may waste a small amount of space. Using a small, uniform extent size may save a small amount of space, or perhaps a large amount if the table has many small partitions: CREATE TABLESPACE ... EXTENT MANAGEMENT LOCAL UNIFORM SIZE 1M;
  9. Increase Block Size: In some rare cases a larger block size may improve compression, reduce row chaining, and use less block overhead.
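
To make items 2, 4, 5 and 7 concrete, here is a rough SQL sketch. MY_TABLE, the 8G value and the SCOPE are placeholders, not recommendations - verify your current settings, edition and licensing before running anything like this.

-- Item 2: check the current multiblock read count (SHOW PARAMETER is a SQL*Plus command).
SHOW PARAMETER db_file_multiblock_read_count;

-- Item 4: grow the buffer cache indirectly by raising the SGA target (requires an spfile).
ALTER SYSTEM SET SGA_TARGET = 8G SCOPE = BOTH;

-- Item 5: reclaim the free space reserved for updates, then rebuild the table.
ALTER TABLE my_table PCTFREE 0;
ALTER TABLE my_table MOVE;

-- Item 7: rebuild the table with basic compression.
ALTER TABLE my_table MOVE COMPRESS;

Note that any MOVE leaves the table's indexes unusable, so they have to be rebuilt afterwards.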
Jon Heller
  • The answer is good. But I understand that parallelism works on the receiving end of the data. The OP is just reading the table so the hardware is the bottleneck? Or am I incorrect? – Ed Heal Dec 12 '15 at 20:20
  • @EdHeal It's hard to know without more information from the OP. I answered the literal question "How to make full table scans faster in Oracle database?". But it's entirely possible this is an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). It would be odd to *only* read 5 million rows; surely someone or something has to *do* something with the data. Also, parallelism works with retrieving the data as well as processing the data. I've seen full table scans go from 100MB/s to 1GB/s with parallelism. – Jon Heller Dec 12 '15 at 20:28
  • @JonHeller Hmm, it could be an XY problem. Sorry, I'm new to SO. The original task is to create a live data-reporting application on an existing database. I've updated the OP. Basically, I've tried creating indexes, which helped, but I still have to make full table scans at some point. And those seem to be the bottleneck at the moment. – painiyff Dec 14 '15 at 17:27
3

You can use the PARALLEL hint to make Oracle use multiple parallel threads to read the full table faster.

SELECT /*+ parallel(t 4) */ * FROM Table1 t;

On an idle database you can set the degree of parallelism as high as the number of CPUs/cores on each instance.
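
For example (just a sketch - the degree of 8 below is illustrative, and the optimizer may downgrade it if the system is busy), you can check how many CPUs the instance reports before picking a degree:

-- Number of CPUs the instance reports; a rough upper bound for the degree of parallelism.
SELECT value FROM v$parameter WHERE name = 'cpu_count';

-- Then use a degree no higher than that, e.g. 8 here.
SELECT /*+ PARALLEL(t 8) */ * FROM Table1 t;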

Durga Viswanath Gadiraju
  • Does this depend on the speed of the hardware storing the data? Is that going to be the bottleneck? – Ed Heal Dec 12 '15 at 15:39
  • It requires multiple CPUs/cores. Also, the data should be on a clustered file system to get better performance from multiple network I/O channels. – Durga Viswanath Gadiraju Dec 12 '15 at 16:04
  • I am suggesting that the parallel bit does not matter - CPU power is not the limiting factor. The limiting factor will be retrieving the data from the hard disk. The parallel hint works if you are actually doing some processing of the data. – Ed Heal Dec 12 '15 at 16:07
  • @EdHeal Parallelism can significantly improve I/O throughput in Oracle. It still may not help, if there's another bottleneck like the network. But it's quite common in Oracle data warehouses to use parallelism to improve full table scan performance. – Jon Heller Dec 12 '15 at 18:29
  • @JonHeller - The statement by the OP is just to read the table. I grant you, if you are doing something constructive with the data, then going parallel is the way to go. But the query in front of us is just reading the data - so the bottleneck will be the transfer of data from the disk. – Ed Heal Dec 12 '15 at 18:31