I have the same query running on two different DB servers with almost identical configuration. The query does a full table scan (FTS) on one table:

SELECT COUNT(1) FROM tax_proposal_dtl WHERE tax_proposal_no = :b1 AND taxid != :b2 AND INSTR(:b3, ',' || STATUS || ',') > 0;

On the 1st DB I get the result in under 3 seconds with 0 disk reads, while on the 2nd DB disk reads are high and the elapsed time is approximately 9 seconds.
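
(For reference, I have been comparing the two runs with a query along these lines against V$SQL; the SQL_TEXT filter is just one way to locate the cursor and may need adjusting:)

SELECT sql_id, executions, buffer_gets, disk_reads,
       ROUND(elapsed_time / 1e6, 1) AS elapsed_secs
FROM   v$sql
WHERE  sql_text LIKE 'SELECT COUNT%tax_proposal_dtl%';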

The only difference in the table's configuration between the two DBs is that on the 1st the table has CACHE = 'Y' while on the 2nd it has CACHE = 'N'. My understanding is that for an FTS the buffer cache won't be used and direct path reads will be used instead. So why is the performance of the same query impacted by CACHE/NOCACHE? (That is the only difference between the two environments, and even the execution plan is the same.)
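
(To rule out a mismatch, the CACHE attribute can be confirmed from the data dictionary on each DB with a quick check of this kind:)

SELECT table_name, cache
FROM   dba_tables
WHERE  table_name = 'TAX_PROPOSAL_DTL';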

As suggested by Jon, and after doing further research on this topic (especially with regard to _SMALL_TABLE_THRESHOLD), I am adding more details.

Current version: Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit

Details of 2nd DB:

[Screenshot: memory details of the 2nd DB, including _STT]

Total block count of the table from DBA_SEGMENTS = 196736

Details of 1st DB:

[Screenshot: memory details of the 1st DB, including _STT]

Total block count of the table from DBA_SEGMENTS = 172288
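
(The block counts above come from a query of this form:)

SELECT segment_name, SUM(blocks) AS total_blocks
FROM   dba_segments
WHERE  segment_name = 'TAX_PROPOSAL_DTL'
GROUP  BY segment_name;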

The execution plans on both DBs are the same, but there are two major differences:

a) On the 2nd DB the cache option is false on the table (I tried ALTER TABLE ... CACHE, but there was still no impact on performance).

b) On the 2nd DB the _STT parameter is 23920, so per the 5 * _STT rule the table does not qualify as a medium-sized table (5 * 23920 = 119600 blocks, which is less than the table's 196736 blocks). On the 1st DB the _STT parameter is 48496, so per the 5 * _STT rule the table does qualify as medium-sized (5 * 48496 = 242480 blocks, which is more than its 172288 blocks).
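
(_STT is a hidden parameter, so its value has to be read from the X$ views while connected as SYS; the commonly used form below is what I mean by checking _STT:)

SELECT p.ksppinm AS parameter, v.ksppstvl AS value
FROM   x$ksppi p, x$ksppcv v
WHERE  p.indx = v.indx
AND    p.ksppinm = '_small_table_threshold';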

Below is a chart, based on my research so far into the _STT and CACHE parameters, of how the system behaves for different table sizes.

[Chart: impact of table size, CACHE, and the _STT parameter]

Please let me know if my understanding is correct: the CACHE option has no impact on medium or large tables, but it helps retain small tables longer in the LRU list. Based on these assumptions and the chart above, I am concluding that on the 2nd DB the table is classified as large, hence direct path reads (DPR) and the higher elapsed time, while on the 1st DB it is classified as medium, hence cache reads and the lower elapsed time.
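
(One way to verify which path is taken is to compare the session's wait events while the query runs; a sketch, assuming the session's SID is known:)

SELECT event, total_waits, time_waited
FROM   v$session_event
WHERE  sid = :sid
AND    event IN ('direct path read', 'db file scattered read');

Seeing 'direct path read' waits would confirm DPR, while 'db file scattered read' indicates a buffered full scan.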

As per this link, I set the _STT parameter at session level on the 2nd DB:

alter session set "_small_table_threshold"=300000;

With this, performance improved considerably and is almost the same as on the 1st DB, with 0 disk reads, which implies the table is now considered small.

I used the following articles in my research:

https://jonathanlewis.wordpress.com/2011/03/24/small-tables/

https://hoopercharles.wordpress.com/2010/06/17/_small_table_threshold-parameter-and-buffer-cache-what-is-wrong-with-this-quote/?unapproved=43522&moderation-hash=be8d35c5530411ff0ca96388a6fa8099#comment-43522

https://dioncho.wordpress.com/tag/full-table-scan/

https://mikesmithers.wordpress.com/2016/06/23/oracle-pinning-table-data-in-the-buffer-cache/

http://afatkulin.blogspot.com/2012/07/serial-direct-path-reads-in-11gr2-and.html

http://afatkulin.blogspot.com/2009/01/11g-adaptive-direct-path-reads-what-is.html

Atif

1 Answer


The keywords CACHE and NOCACHE are a bit misleading; they don't simply enable or disable caching, they only make cache reads more or less likely by changing how the data is stored in the cache. Like most memory systems, the Oracle buffer cache is constantly adding new data and aging out old data. The default, NOCACHE, will still add table data from full table scans to the buffer cache, but it will mark it as the first piece of data to age out.
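
For illustration, the attribute is toggled with plain DDL; note that this only changes where future full-scan blocks land on the LRU list, not whether caching happens at all:

ALTER TABLE tax_proposal_dtl CACHE;
ALTER TABLE tax_proposal_dtl NOCACHE;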

According to the SQL Language Reference:

CACHE

For data that is accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the most recently used end of the least recently used (LRU) list in the buffer cache when a full table scan is performed. This attribute is useful for small lookup tables.

...

NOCACHE

For data that is not accessed frequently, this clause indicates that the blocks retrieved for this table are placed at the least recently used end of the LRU list in the buffer cache when a full table scan is performed. NOCACHE is the default for LOB storage.

The real behavior can be much more complicated. The in-memory option, result caching, OS and SAN caching, direct path reads (usually for parallelism), the small table threshold (where Oracle doesn't cache the whole table if it exceeds a threshold), and probably other features I can't think of may affect how data is cached and read.


Edit: I'm not sure if I can add much to your analysis. There's not a lot of official documentation around these thresholds and table scan types. Looks like you know as much about the subject as anyone else.

I would caution that this kind of full table scan optimization should only be needed in rare situations. Why is a query frequently doing a full table scan of a 1GB table? Isn't there an index or a materialized view that could help instead? Or maybe you just need to add more memory if you need the development environment to match production.
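
For example, an index like the one below might let the optimizer avoid the full scan entirely. This is only a sketch, and whether it helps depends on how selective TAX_PROPOSAL_NO is:

-- Hypothetical covering index: the COUNT could be answered from the
-- index alone, filtering on TAXID and STATUS without touching the table.
CREATE INDEX tax_proposal_dtl_ix1
    ON tax_proposal_dtl (tax_proposal_no, taxid, status);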

Another option, instead of changing the small table threshold, is to change the perceived size of the table. Modify the statistics so that Oracle thinks the table is small. This way no other tables are affected.

begin
    -- Pretend the table is tiny so Oracle classifies it as a small table
    -- and favors cached reads over direct path reads.
    dbms_stats.set_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL', numblks => 999);
    -- Lock the stats so the next gathering job doesn't overwrite the fake value.
    dbms_stats.lock_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL');
end;
/
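
If you need to undo this later, unlock and re-gather the statistics:

begin
    dbms_stats.unlock_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL');
    dbms_stats.gather_table_stats(ownname => user, tabname => 'TAX_PROPOSAL_DTL');
end;
/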
Jon Heller
  • Thanks Jon. But as per the DBAs everything is the same in both environments. So I am unable to understand why on the 1st the disk reads are 0 while on the 2nd the disk reads are high and ultimately the elapsed time is high. Do you know how I can check this difference and what the possible reason could be? – Atif Jun 09 '20 at 04:46
  • But didn't you say the CACHE settings were different between the databases? That table setting could explain the difference. A difficulty with caching is that patterns of use can affect performance. One database may be running other queries that read from the same table, keeping it in the cache, and thus indirectly improving the query you care about. You might be able to track down related queries in views like `GV$ACTIVE_SESSION_HISTORY`, but that can be difficult. – Jon Heller Jun 09 '20 at 05:12
  • But as per my recent understanding of the topic, caching of FTS blocks also depends on _small_table_threshold: if the number of blocks in a big table is greater than _small_table_threshold, then the FTS will do direct reads and will not use the buffer cache. So if CACHE is 'Y' and the number of blocks is less than _small_table_threshold, then on an FTS the blocks will be stored at the MRU end of the buffer cache's LRU list. – Atif Jun 09 '20 at 08:02
  • But if CACHE is 'N' and the number of blocks in the table is less than _small_table_threshold, then the data blocks will be placed at the other end of the LRU list and will age out of the buffer cache more quickly. In the last scenario, if CACHE is 'Y' and the number of blocks in the table is more than _small_table_threshold, the blocks won't be fetched into the buffer cache; direct reads from disk will be used instead. – Atif Jun 09 '20 at 08:12
  • In my case the table is quite big (a few GB) on both DBs, so I think only the 3rd scenario should apply to both, which means CACHE has no impact on the FTS; yet at the same time the disk reads are 0. First I have asked the DBAs to check the value of the _small parameter, then we can conclude on the above point. What you are saying would also make sense: due to other queries on the 2nd DB, most of the blocks may already be in the buffer cache, and not because of the CACHE option on the table. But when I compare the number of the table's blocks in the buffer cache on both DBs, it is almost 90% most of the time. – Atif Jun 09 '20 at 08:22
  • This question is getting pretty tricky - you may want to modify the original question and put all of these details in one place. (Most people won't read through these comments.) Unfortunately the table threshold is undocumented, but I did find a better source from an Oracle employee [here](https://blogs.oracle.com/smartscan-deep-dive/when-bloggers-get-it-wrong-part-1). Apparently newer versions of Oracle also have a medium table threshold that defaults to 10% but depends on how "busy" the system is, which might make this question difficult. – Jon Heller Jun 10 '20 at 03:16
  • You may want to add details like the size of the table from DBA_SEGMENTS, make sure the statistics are up to date, some parameter values like the SGA size, the full execution plan, and maybe look at GV$ACTIVE_SESSION_HISTORY or use DBMS_SQLTUNE.REPORT_SQL_MONITOR to get a list of the exact wait events for both. Also, you may want to specify exactly what environment you're using - which version, edition, operating system, is it RAC, Exadata, etc. Normally that level of detail is unnecessary, but you have a weird issue here. – Jon Heller Jun 10 '20 at 03:19
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/215631/discussion-between-mohd-atif-and-jon-heller). – Atif Jun 10 '20 at 05:00