8

I have a complex query on a database with bad statistics and fragmented indices. What I'm puzzled by is that when I examine an actual query plan I get 54 M rows from a table scan on a table that has 23 K rows. Much further up the query plan this table is joined against itself (only 260 K rows out of 23 K). How is this possible?

Running some other queries or rebuilding indices and statistics makes this go away, I'm just trying to understand why this would happen.

I have reproduced this with SQL 2005 and SQL 2008 R2 on a restore of the same database.

Update: Yes, this is an actual plan. Number of rows is 20039 (not 23 K as mentioned above). This is one the the rightmost nodes.

PavelR
  • 123
  • 6
  • Hmmm... is the table being scanned multiple times? – Will A May 26 '11 at 19:44
  • The table is self joined in the query for good reasons. In the query plan it is scanned twice (Table Scan mentioned above) and also three index seeks on the table. – PavelR May 26 '11 at 19:51
  • Is this estimated execution plan, or actual execution plan? – Ryk May 26 '11 at 23:14
  • Yes - actual plan, attached an image. – PavelR May 27 '11 at 00:07
  • 2
    @Pavel - Interesting that 2701 * 20039 gives **exactly** the right number. sure it isn't being executed 2701 times? Is it on the inner side of a join or something? – Martin Smith May 27 '11 at 01:34
  • One other thing that's not related to your question directly, but seeing the "Table Scan" operator in your query plan indicates that that table is a heap (otherwise you'd see a Clustered Index Scan). You really should investigate putting a clustered index on that table. – Ben Thul May 27 '11 at 12:17

1 Answers1

6

It looks as though this node in the execution plan is the "second" table involved in a nested loop join, with 2701 rows in the "first" table (thanks Martin!).

As there appears to be no appropriate index on the HistoricalPrice table, the heap must be scanned for every row in the loop join, resulting in a total of 2701*20039 = 54,125,339 rows. The number of rows coming out of the Nested Loop operator will be the total number of joined/matched rows.

While the execution plan only shows the table being accessed as one node, the loop join would end up accessing that table as many times as there are rows. Without an index, the entire table must be scanned, which returns 20,039 rows back to the Nested Loop operator each time.

If an appropriate index was placed on the table to support the join, then perhaps only a single row would be seeked, and thus a smaller number of rows sent back to the Nested Loop.

Jim McLeod
  • 952
  • 6
  • 8