7

We have a stored proceudre in our database that updates a table by joining 2 tables on 30 columns with a where condition. The SQL Is in the general format of:

UPDATE Target
SET col1 = Source.col1
INNER JOIN Source 
on 
ISNULL(Target.Col2, '') = ISNULL(Source.Col2, '') and
ISNULL(Target.Col3, '') = ISNULL(Source.Col3, '') and
.
.
.
ISNULL(Target.Col31, '') = ISNULL(Source.Col31, '') and 

Here's the Query Plan. Save it to your PC and reopen it so it scales better.

enter image description here

The Source table has 65M records, the Target 165M. Previously it used to run in quite a few minutes. Considering how ugly and potentially inefficient the query is, I find this surprising. This month it ran for 1.5 hours, used 100% of the processor and we had to kill it.

Any suggestions how to improvise the below query and make it run on time..?

We have single column Indexes on a few of the columns used in the 30-col join condition.

I know the ISNULL function and the join on 30 columns is nuts and this is a bad design. Don't blame me, I inherited this economy.

Unfortunately there is no time for a re-design. Any suggestions?

Chad
  • 23,658
  • 51
  • 191
  • 321
  • 4
    What changed since last month? – Remus Rusanu May 02 '12 at 21:55
  • 2
    Update all NULLs to `''`, then run the query without the ISNULLs. But Remus' question is very pertinent, what *has* changed in that period? – MatBailie May 02 '12 at 21:56
  • 1
    Have you considered re-writing it as a merge? Have you considered making those columns non-nullable to eliminate 60 ISNULL calls? – Aaron Bertrand May 02 '12 at 21:56
  • 1
    You could check the query plan and try to modify the query to make sure the most efficient of the indexes is used – erikxiv May 02 '12 at 21:59
  • @erikxiv - Due to the ISNULLs I don't think the indexes will being used to maximum effect anyway. – MatBailie May 02 '12 at 22:01
  • What if the `ISNULL` is rewritten as `T.C = S.C OR (T.C IS NULL AND S.C IS NULL)` *or* if (icky, I know!) ANSI_NULL is turned off for the query (and then `ISNULL` can be dropped entirely). I would imagine the planner turns `ISNULL(...,c)` into something acceptable, but... it wouldn't hurt to see what the more explicit plan yields. A 5-minute test, in any case. –  May 02 '12 at 22:02

2 Answers2

5
  1. Please post a screenshot of the estimated execution plan
  2. I suspect that previously the query used a hash join (which it should) but somehow cardinality estimation has it wrong now and you get a loop join. Slap a hash join hint on the query to see if it fixed this (INNER HASH JOIN). Once we have the exact plan we can say more.
  3. Change the equality to (A1 = A2 OR (A1 IS NULL AND A2 IS NULL)). SQL Server actually recognizes this pattern and converts it to an "exact equals without stupid null semantics" internally. You can seek indexes that way even with null values.

If this doesn't help, be sure to do step (3) and create a covering index on col2-col31 including col1. This will get you a merge join which is the most efficient plan possible in this case. It is really fast. Warning: This will double the on-disk size of the table and slow down updates.

usr
  • 168,620
  • 35
  • 240
  • 369
0

The DBa suggested that we follow the query analyers recommendation to add an index with all of the 30 columns, most were "included" columns. This allowed the query to complete. The next month when we ran, the same update SQL which normally runs in 1.5 hours did not complete in 24 hours. When we ran Update statistics, it finished in an hour.

Chad
  • 23,658
  • 51
  • 191
  • 321