2

In this scenario, all Contractors place an order for various Stock Items on a daily basis. This data is stored in the ContractorOrder table. When the Stock is received, the values are recorded and saved in the ContractorStock table. It is not possible for line items to appear in the ContractorStock table if they do not exist in the ContractorOrder table.

Sometimes however, some items are not delivered, or the incorrect quantities are delivered, or the Contractors forgot to place an order. To check this, I wrote these two queries (ignore the situation where the Sums are the same but individual line items differ - this is just a simplified example):

SELECT C.ContractorId
, C.ContractorName
, Sum(CO.Quantity) OrderQuantity
, Sum(CS.Quantity) ContractorQuantity
FROM [Contractor] C
LEFT JOIN ContractorOrder CO
  ON C.ContractorId = CO.ContractorId
  AND CO.Date = @date
LEFT JOIN
(
  ContractorStock CS
  JOIN StockCode SC
    ON CS.StockCodeId = SC.StockCodeId
    AND CS.Date = @date
)
  ON CS.ContractorId = C.ContractorId
  AND SC.StockCode = CO.StockCode
GROUP BY C.ContractorId
, C.ContractorName

http://sqlfiddle.com/#!6/55496/5

And:

SELECT C.ContractorId
, C.ContractorName
, Sum(CO.Quantity) OrderQuantity
, Sum(CS.Quantity) ContractorQuantity
FROM [Contractor] C
LEFT JOIN ContractorOrder CO
  ON C.ContractorId = CO.ContractorId
  AND CO.Date = @date
LEFT JOIN
(
  SELECT CS.ContractorId
  , SC.StockCode
  , CS.Quantity
  FROM ContractorStock CS
  JOIN StockCode SC
    ON CS.StockCodeId = SC.StockCodeId
    AND CS.Date = @date
) CS
  ON CS.ContractorId = C.ContractorId
  AND CS.StockCode = CO.StockCode
GROUP BY C.ContractorId
, C.ContractorName

http://sqlfiddle.com/#!6/55496/2

Both of them return the same results, but there is a slight difference in the Execution Plans. Are these queries not logically equivalent in this situation? If they are, should I prefer the Nested Join?

Nested Join:

enter image description here

Sub Query:

enter image description here

ilitirit
  • 16,016
  • 18
  • 72
  • 111

1 Answers1

0

What you just proved (when it comes different kinds of queries) is that you should let SQL Server figure out the best plan (as the SQL versions go up the engine gets better and better at finding the best query plan) and you should concentrate on writing SQL that produces correct results that are set based.

The only difference I see is a compute scaler step that is closer to 0% of the total plan so its not worth spending alot of time eliminating that step.

In the old days (pre-SQL Server 2000) it actually mattered if you wrote joins vs sub-queries. Now in the great majority of times it does not pay to change a sub-query to a join or vice versa.

You always want to avoid row by row operations and replace, where possible, scans with seeks but once you have done that move on.

benjamin moskovits
  • 5,261
  • 1
  • 12
  • 22