0

I am querying SQL Server 2012.

I have a left outer join onto a large result set, that looks like the following:

Data
LEFT OUTER JOIN
(
    SELECT
        MemberID
        ,MIN(Date)
    FROM
        Tables
    GROUP BY
        MemberID
) T
ON Data.MemberID = T.MemberID

Where Data is already a large query (tens of thousands of rows returned).

My question is: Is this nested select query evaluated only for the MemberIDs I already have in "Data"? This seems to have slowed the sproc down fairly significantly, is there a more standard way of doing this? Or is this just to be expected?

Thanks in advance.

MattMcGowan
  • 117
  • 1
  • 6

2 Answers2

1

No, the subquery is evaluated for all the rows.

If you want the first row for something, then do:

row_number() over (partition by Memberid order by Date) as seqnum

And then use:

where seqnum = 1
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • "*the subquery is evaluated for all the rows*" - are you sure? I would expect the `T` query to be executed once (because it is not a co-related query) and the join to be done on the result of that. The result might be kept in memory memory or written to disk depending on the number of rows, but still that would be done only once. –  Nov 04 '14 at 15:00
  • @a_horse_with_no_name . . . In every database that I have seen, an aggregation subquery is not optimized based on filters outside the subquery. I would *welcome* a database optimizer that performed otherwise. – Gordon Linoff Nov 04 '14 at 15:57
0

In the end I used Gordon Linoffs method of selecting the first row, and then changed the Left Outer Join to an Outer Apply with WHERE condition in the nested select statement of

Data.MemberID = Tables.MemberID

This increased the speed a lot. Note that the original question has a much simplified version of the query, so this solution may not be true in all cases.

MattMcGowan
  • 117
  • 1
  • 6