How to find rows that are not shared between two queries from the same source table

Question

I have a CTE that that gives me the result of 760 rows. And I have another SELECT statement that gives me 722 rows. I want to see which records exist in CTE that are not exists in SELECT statement. I'm using NOT EXISTS statement, but for some reason it doesnt give me any result. I also tried NOT IN - but the same...no records.

;WITH  Cte_Policies AS 
        (
        SELECT 
            PolicyNumber,
            ControlNo,
            EffectiveDate,
            ExpirationDate, 
            ProducerName,
            SUM(BOUND_Premium) as NetWrittenPremium
        FROM    CatalyticWindEQ 
        WHERE   EffectiveDate >= '05-01-2016' AND EffectiveDate <= EOMONTH(GETDATE())       
                AND LineName = 'Earthquake'  AND Underwriter <> 'Batcheller, Jerry' AND PolicyNumber IS NOT NULL
        GROUP BY 

                ProducerName,
                EffectiveDate
                ,ExpirationDate ,PolicyNumber, ControlNo
        )
SELECT PolicyNumber,
        ControlNo,
        YEAR(EffectiveDate) as PolicyEffectiveYear,
        MONTH(EffectiveDate) as PolicyEffectiveMonth,
        NetWrittenPremium,
        ProducerName as Producer


FROM 
    Cte_Policies 


    where 
    NOT EXISTS

    (
        SELECT  
                    PolicyNumber
        FROM        CatalyticWindEQ eq
        WHERE       EffectiveDate>='05-01-2016' AND EffectiveDate <= EOMONTH(GETDATE()) AND LineName = 'Earthquake' AND Underwriter <> 'Batcheller, Jerry'
                    AND PolicyNumber IS NOT NULL
                    and eq.PolicyNumber=Cte_Policies.PolicyNumber
        GROUP BY    PolicyNumber
    )

The result from CTE only with 760 rows looks like this:

And the result from SELECT statement that gives 722 rows looks like this:

I've done this"

; with CTE as 
(
        SELECT 
            PolicyNumber,
            ControlNo,
            EffectiveDate,
            ExpirationDate, 
            ProducerName,
            SUM(BOUND_Premium) as NetWrittenPremium
        FROM    CatalyticWindEQ 
        WHERE   EffectiveDate >= '05-01-2016' AND EffectiveDate <= EOMONTH(GETDATE())       
                AND LineName = 'Earthquake'  AND Underwriter <> 'Batcheller, Jerry' AND PolicyNumber IS NOT NULL
        GROUP BY 

                ProducerName,
                EffectiveDate
                ,ExpirationDate ,PolicyNumber, ControlNo
        )
SELECT  PolicyNumber,

        min(tag) as min_tag,
        max(tag) as max_tag


FROM 
        (
        SELECT PolicyNumber, 1 as tag FROM CTE
    UNION ALL 
        SELECT PolicyNumber, 2 as tag FROM CatalyticWindEQ
        ) U
GROUP BY PolicyNumber
HAVING COUNT(*)=1

And now I have 888 rows with min_tag = 2 and max_tag=2. Does it mean each policy number is duplicated in my source table?

Not too familiar with TSQL, but could it be the GROUP BY in your NOT EXISTS clause? Doesn't look like you're using any aggregate function in order for the GROUP BY to do anything. — Brandon, Oct 11 '16 at 22:39
Just commented it out and tried again - but problem remains. Thanks though — Serdia, Oct 11 '16 at 22:41
[`EXCEPT`](https://msdn.microsoft.com/en-us/library/ms188055.aspx)? — HABO, Oct 11 '16 at 23:43

score 0 · Answer 1 · answered Oct 11 '16 at 22:41

0

The basic strategy is to select both datasets (lists of PolicyNumber), collect them using union all, and find the unique items in the combination.

; with CTE as ( ... )  -- the CTE from above
Select PolicyNumber, min(tag) as min_tag, max(tag) as max_tag from (
    select PolicyNumber, 1 as tag from CTE
  union all
    select PolicyNumber, 2 as tag from CatalyticWindEQ  -- the source you're matching
) U
Group by PolicyNumber
Having count(*) =1  -- equivalently, having min(tag) = max(tag)

Rows having max(tag) = 1 are only in the CTE.

answered Oct 11 '16 at 22:41

cco

5,873
1
16
21

I did it. Now I have 888 rows with min_tag = 2 and max_tag = 2. So does it mean each policy number in table `CatalyticWindEQ` repeats twice? – Serdia Oct 11 '16 at 23:08
I added the result above – Serdia Oct 11 '16 at 23:09
If min_tag = 2, the row is not returned by the CTE. All the 2 means is that it's in the second part of the `union` statement. – cco Oct 11 '16 at 23:56

score 0 · Answer 2 · answered Oct 11 '16 at 23:12

0

you can use the full outer join to flag the differences, after putting your second query inside another cte, say cte2, you can try something like this:

select 
   a.PolicyNumber,
   a.ControlNo,
   a.YEAR(EffectiveDate) as PolicyEffectiveYear,
   a.MONTH(EffectiveDate) as PolicyEffectiveMonth,
   a.NetWrittenPremium,
   a.ProducerName as Producer,
   b.PolicyNumber
from Cte_Policies as a
full outer join cte2 as b ON b.PolicyNumber=a.PolicyNumber
where 
      a.PolicyNumber is null -- will show records NOT in cte.
   OR b.PolicyNumber is null -- Will show records NOT in cte2.

answered Oct 11 '16 at 23:12

Ahmed Saeed

831
7
12

It also gives me empty table. Does it mean they both have exact same PolicyNember's ?? – Serdia Oct 11 '16 at 23:23
getting no records, means all policy numbers are common in both sides of the query. but since you have different counts for both queries, cte and cte2, and you are doing grouping, I guess you have some policyNumber duplicate. you need to double check. – Ahmed Saeed Oct 11 '16 at 23:48

How to find rows that are not shared between two queries from the same source table

2 Answers2