
So, a few weeks ago, I asked about Oracle execution plan cost vs. speed in relation to the FIRST_ROWS(n) hint. I've run into a similar issue, this time with the ORDERED hint. When I use the hint, my execution time improves dramatically (by upwards of 90%), but the EXPLAIN PLAN for the query reports an enormous cost increase. In this particular query, the cost goes from 1500 to 24000.

The query is parameterized for pagination, and joins 19 tables to get the data out. I'd post it here, but it is 585 lines long and is written for a vendor's messy, godawful schema. Unless you happened to be intimately familiar with the product this is used for, it wouldn't be much help to see it. However, I gathered the schema stats at 100% shortly before starting work on tuning the query, so the CBO is not working in the dark here.

I'll try to summarize what the query does. The query essentially returns objects and their children in the system, and is structured as a large subquery block joined directly to several tables. The first part returns object IDs and is paginated inside its query block, before the joins to other tables. Then, it is joined to several tables that contain child IDs.
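A heavily simplified sketch of the shape (table and column names here are invented; the real query joins 19 tables):

```sql
SELECT /*+ ORDERED */ obj.object_id, child.child_id
FROM (
    -- pagination happens here, inside the subquery block,
    -- before any joins to the child tables
    SELECT q.object_id, ROWNUM rn
    FROM (SELECT o.object_id FROM objects o ORDER BY o.created_date) q
    WHERE ROWNUM <= :upper_bound
) obj
JOIN object_children child ON child.parent_id = obj.object_id
-- ... joins to many more child tables ...
WHERE obj.rn > :lower_bound
```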

I know that the CBO is not all-knowing or infallible, but it really bothers me to see an execution plan this costly perform so well; it goes against a lot of what I've been taught. With the FIRST_ROWS hint, the solution was to provide a value n such that the optimizer could reliably generate the execution plan. Is there a similar kind of thing happening with the ORDERED hint for my query?

  • The query is really big and ugly. Plus, it's against a vendor's ugly, messy schema and it would take a long time to explain what's in each table. I added some additional info though, hopefully it'll help. – monitorjbl May 22 '13 at 15:38
  • Are the table stats up to date? –  May 22 '13 at 15:41
  • The stats are up to date, I ran them at 100% shortly before I started trying to tune the query. – monitorjbl May 22 '13 at 15:42
  • The estimated cost might increase for retrieving all rows, making that potentially slower than it would have been before (but again this is an estimate); perhaps it doesn't take the pagination into account in the cost, since different values would otherwise cause the same plan to vary quite a bit? Without seeing the query, this might be one of those that is fast for early pages and slows down later on (beyond reasonable use), so you're kind of seeing an average over the whole range of possible pages. Or, you're just reading too much into it *8-) – Alex Poole May 22 '13 at 15:56
  • Wow, I just tried replacing the pagination parameters with literals. The cost went down to a much more reasonable value. Still higher, so the CBO won't choose it on its own (duh), but a lot less worrisome. If you make that an answer, I'll accept it. – monitorjbl May 22 '13 at 16:06
  • @monitorjbl - my main point was the same one David Aldridge made in his answer (one second before my comment, apparently), so I don't think there's any point duplicating that. Vincent also made some good points, of course. – Alex Poole May 22 '13 at 16:16
  • The `ORDERED` hint merely means you are taking responsibility for the join order - it doesn't magically make your query faster, it just allows you to override the CBO in deciding which order the tables should be joined. You should expect that adding the ORDERED hint *will* make the execution plan cost higher - but that's irrelevant, as you have taken some of the responsibility for tuning the query on yourself, because you have a better understanding of the data patterns than the CBO can guess from the statistics. – Jeffrey Kemp May 23 '13 at 02:54

2 Answers


You should not rely on the execution cost to optimize a query. What matters is the execution time (and in some cases resource usage).

From the Concepts Guide:

The cost is an estimated value proportional to the expected resource use needed to execute the statement with a particular plan.

When the estimate is off, it is most often because the statistics available to the optimizer are misleading. You can correct that by giving the optimizer more accurate statistics. Check that the statistics are up to date. If they are, you can gather additional statistics, for example by enabling dynamic sampling or by manually creating a histogram on a data-skewed column.
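For example, a histogram on a skewed column can be requested when gathering stats with DBMS_STATS (the owner, table, and column names below are placeholders):

```sql
-- Re-gather stats and ask for a histogram on a skewed column
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP_OWNER',
    tabname    => 'OBJECTS',
    method_opt => 'FOR COLUMNS SIZE 254 STATUS'
  );
END;
/
```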

Another factor that can explain the disparity between relative cost and execution time is that the optimizer is built upon simple assumptions. For example:

  • Without a histogram, every value in a column is assumed to be uniformly distributed
  • An equality predicate is assumed to select 5% of the rows (without a histogram or dynamic statistics)
  • The data in each column is independent of the data in every other column
  • Furthermore, for queries with bind variables, a single cost is computed and reused for further executions (even if the bind values change, possibly changing the cardinality of the query)
  • ...

These assumptions are made so that the optimizer can return an execution cost that is a single figure (and not an interval). For most queries these approximations don't matter much and the result is good enough.

However, you may find that sometimes the situation is simply too complex for the optimizer and even gathering extra statistics doesn't help. In that case you'll have to manually optimize the query, either by adding hints yourself, by rewriting the query or by using Oracle tools (such as SQL profiles).

If Oracle could devise a way to accurately determine the execution cost, we would never need to optimize a query manually in the first place!

Vincent Malgrat
  • I understand that, and I certainly don't rely solely on execution cost. What bothers me here is the fact that the reported cost is 20 times higher, despite running much, much faster. – monitorjbl May 22 '13 at 15:46
  • I updated my answer, mostly the disparity comes from (1) inaccurate statistics and (2) unrealistic assumptions by the optimizer. I'll try to add relevant links. – Vincent Malgrat May 22 '13 at 16:02
  • While I agree in principle, in theory the cost is at least strongly related to the total time for the query to complete (resource time rather than wall clock time, of course, particularly for parallel query). It is the time expressed in units of the duration of a single logical read, so if a single read takes 1 ms then a cost of 1,000 means an estimated time of one second. Of course, in practice things can be very different. – David Aldridge May 22 '13 at 17:19

The reported cost is for the execution of the complete query, not just the first set of rows. (PostgreSQL does the costing slightly differently, in that it provides the cost for the initial return of rows and for the complete set).

For some plans the majority of the cost is incurred prior to returning the first rows (e.g. where a sort-merge join is used), and for others the initial cost is very low but the cost per row is relatively high thereafter (e.g. a nested loop join).
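You can see the two cost profiles on any pair of joined tables by forcing each join method and comparing the plans (the table and column names below are hypothetical):

```sql
-- Sort-merge: high up-front cost, first row arrives late
EXPLAIN PLAN FOR
SELECT /*+ USE_MERGE(c) */ p.id, c.id
FROM parents p JOIN children c ON c.parent_id = p.id;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Nested loops: first row arrives quickly, cost accumulates per row
EXPLAIN PLAN FOR
SELECT /*+ USE_NL(c) */ p.id, c.id
FROM parents p JOIN children c ON c.parent_id = p.id;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```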

So if you are optimising for the return of the first few rows and joining 19 tables, you may get a very low cost for returning the first 20 with a nested loop-based plan. However, for the complete set of rows, the cost of that plan might be very much higher than that of other plans optimised for returning all rows at the expense of a delay in returning the first.
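For example, with a classic ROWNUM pagination wrapper (table names here are made up), the optimizer can only account for the small page size when it can actually see it:

```sql
-- With binds, the plan is costed without knowing how small the page is
SELECT * FROM (
  SELECT q.*, ROWNUM rn
  FROM (SELECT * FROM objects ORDER BY object_id) q
  WHERE ROWNUM <= :upper_bound
) WHERE rn > :lower_bound;

-- With literals, the optimizer can see that only 20 rows are wanted
SELECT * FROM (
  SELECT q.*, ROWNUM rn
  FROM (SELECT * FROM objects ORDER BY object_id) q
  WHERE ROWNUM <= 40
) WHERE rn > 20;
```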

David Aldridge
  • This is the issue I was running into. I parameterized the part of my query that paginated. It looks like the optimizer was assuming that the size of the page would include all results, and using the ORDERED hint would have been very expensive for that. Not even combining the FIRST_ROWS hint with this made the CBO believe that I wouldn't want all results. But once I replaced the parameters with literals, the cost dropped down to something more reasonable. – monitorjbl May 22 '13 at 18:12