T-SQL Join with "max" filter

Question

I'm using SQL Server 2014.

I'd like the following to give me no duplicate phone ID's or numbers:

WITH Phones as
(
    SELECT * FROM (VALUES 
    (1,'602 600 8000'),
    (2,'602 600 8001'),
    (3,'602 600 8002')
    ) AS Vict_t (Id,Number)
), InvoicePhones as
(
    SELECT * FROM (VALUES
    (10, 1, 100, 'Alpha'),
    (11, 1, 101, 'Bravo'),
    (12, 1, 102, 'Charlie'),
    (13, 2, 103, 'Alpha'),
    (14, 2, 104, 'Bravo'),
    (15, 2, 105, 'Charlie'),
    (16, 3, 106, 'Alpha'),
    (17, 3, 107, 'Bravo'),
    (18, 3, 108, 'Charlie')
    ) as ip_t (Id,PhoneId,VoiceId, Name)
), Voices as
(
    SELECT * FROM (VALUES
     (100, '201701'),
     (101, '201702'),
     (102, '201703'),
     (103, '201704'),
     (104, '201705'),
     (105, '201706'),
     (106, '201708'),
     (107, '201709'),
     (108, '201710')
    ) AS Voices_t (Id,BillingCycle)
)

SELECT P.Id PhoneId, P.Number, IP.Name
FROM Phones P
LEFT JOIN InvoicePhones IP on IP.PhoneId = P.Id and IP.VoiceId =
    (
    select TOP 1 id
    from Voices V
    where V.Id = IP.VoiceId
    order by V.BillingCycle desc
    )

I cannot understand why the sub-select is not eliminating the duplicates.

What I'm receiving is this:

1   602 600 8000    Alpha
1   602 600 8000    Bravo
1   602 600 8000    Charlie
2   602 600 8001    Alpha
2   602 600 8001    Bravo
2   602 600 8001    Charlie
3   602 600 8002    Alpha
3   602 600 8002    Bravo
3   602 600 8002    Charlie

What I'm expecting is this:

1   602 600 8000    Charlie
2   602 600 8001    Charlie
3   602 600 8002    Charlie

This example uses simple integer ID's, but the real tables I'm working with are using uniqueidentifier. Thus the answer I need must take that into account.

I tried both versions of this accepted answer, but it doesn't work for me.

What am I missing?

Update

In addition to the answer I chose, I realized another way to solve this problem is as follows:

SELECT P.Id PhoneId, P.Number, IP.Name
FROM Phones P
LEFT JOIN InvoicePhones IP on IP.PhoneId = P.Id and IP.VoiceId =
    (
    select TOP 1 V.Id
    from Voices V
    INNER JOIN InvoicePhones IPS ON IPS.VoiceId = V.Id
    WHERE P.Id = IPS.PhoneId
    order by V.BillingCycle desc
    )

I'm curious if they can also be solved with an OUTER APPLY, as mentioned in this other SO post

Your sample data is not very good, because `select * from phones` is quite sufficient to get those results. — Gordon Linoff, May 17 '17 at 22:19
`I cannot understand why the sub-select is not eliminating the duplicates` Why would it? Your subselect is not much different from saying `and 1 = 1`, with the data you've given, so it wouldn't make a difference to your overall query. — ZLK, May 17 '17 at 22:21
@GordonLinoff I've added another field to the projection to make it more clear what the issue is. — Brent Arias, May 17 '17 at 22:27
Why the output is `Charlie`? Still isnt clear why you need the third table. — Juan Carlos Oropeza, May 17 '17 at 22:29
Right now your query is `IP.VoiceId = query`, but in the query you say `V.Id = IP.VoiceId` and you are selecting `V.id` as ZLK say, that is just `1 = 1` — Juan Carlos Oropeza, May 17 '17 at 22:33
@JuanCarlosOropeza The query is sorted, with a "top 1" projection...this means that the largest "BillingCycle" will be chosen and the others removed. When the others are removed, the `IP.VoiceId=query` should fail. — Brent Arias, May 17 '17 at 22:39

Juan Carlos Oropeza · Accepted Answer · 2017-05-17T22:43:23.093

0

Looks like you need use ROW_NUMBER() to get the latest ID, but the final logic or the need for the third table isnt clear.

SQL DEMO

), filter as (
    SELECT P.Id PhoneId, P.Number, IP.Name, IP.VoiceId,
           ROW_NUMBER() OVER (PARTITION BY P.Id ORDER BY VoiceID DESC) as rn
    FROM Phones P
    LEFT JOIN InvoicePhones IP on IP.PhoneId = P.Id 
)
SELECT *
FROM filter 
WHERE rn = 1

OUTPUT

To include the third table:

SQL DEMO

), filter as (
    SELECT P.Id PhoneId, P.Number, IP.Name, IP.VoiceId, V.*, 
           ROW_NUMBER() OVER (PARTITION BY P.Id ORDER BY BillingCycle DESC) as rn
    FROM Phones P
    LEFT JOIN InvoicePhones IP on IP.PhoneId = P.Id 
    LEFT JOIN Voices V on V.Id = IP.VoiceId
)
SELECT *
FROM filter 
WHERE rn = 1

OUTPUT

edited May 17 '17 at 22:43

answered May 17 '17 at 22:18

Juan Carlos Oropeza

47,252
12
78
118

I've updated my question to clarify why I must join all three tables. – Brent Arias May 17 '17 at 22:29
Check the EDIT version again – Juan Carlos Oropeza May 17 '17 at 22:44
I've updated my question to clarify that the *real* tables I'm working with have `uniqueidentifier` key types instead of the integer key values I've shown. Thus I need an answer that does not rely on ordering/sorting any of the keys; this is why the third table is needed. – Brent Arias May 17 '17 at 22:56

score 0 · Answer 2 · answered May 17 '17 at 22:18

I'm not sure what you want. But you have this condition:

IP.VoiceId = (select TOP 1 id
              from Voices V
              where V.Id = IP.VoiceId
              order by V.BillingCycle desc
             )

The correlation clause is V.id = IP.VoiceId. The equality comparison is -- essentially -- IP.VoiceId = V.id. They are the same. As long as there is at least one matching record in Voices, then the IP record passes the test. With your data, all IP records pass the test.

I'm not sure what you are really trying to accomplish. If you want only one row per phone, then I would be thinking EXISTS or IN.

The simplest way to get the results you want is:

select p.*
from phones p;

I suspect that you want more sophisticated logic, however.

score 0 · Answer 3 · answered May 18 '17 at 07:07

;WITH Phones as
(
    SELECT * FROM (VALUES 
    (1,'602 600 8000'),
    (2,'602 600 8001'),
    (3,'602 600 8002')
    ) AS Vict_t (Id,Number)
), InvoicePhones as
(
    SELECT * FROM (VALUES
    (10, 1, 100, 'Alpha'),
    (11, 1, 101, 'Bravo'),
    (12, 1, 102, 'Charlie'),
    (13, 2, 103, 'Alpha'),
    (14, 2, 104, 'Bravo'),
    (15, 2, 105, 'Charlie'),
    (16, 3, 106, 'Alpha'),
    (17, 3, 107, 'Bravo'),
    (18, 3, 108, 'Charlie')
    ) as ip_t (Id,PhoneId,VoiceId, Name)
), Voices as
(
    SELECT * FROM (VALUES
     (100, '201701'),
     (101, '201702'),
     (102, '201703'),
     (103, '201704'),
     (104, '201705'),
     (105, '201706'),
     (106, '201708'),
     (107, '201709'),
     (108, '201710')
    ) AS Voices_t (Id,BillingCycle)
)
,Expected
AS
(
SELECT P.Id PhoneId, P.Number, IP.Name
FROM Phones P
LEFT JOIN InvoicePhones IP on IP.PhoneId = P.Id and IP.VoiceId =
    (
    SELECT TOP 1 id
    FROM Voices V
    WHERE V.Id = IP.VoiceId
    ORDER BY V.BillingCycle desc
    )
    )
SELECT PhoneId,Number,Name From
(
SELECT *,ROW_NUMBER()OVER(PARTITION BY Phoneid,Number  ORDER BY Phoneid)AS Seq
FROM Expected
)DT
WHERE DT.Seq=3

OutPut

 PhoneId Number         Name
  ------------------------------
    1    602 600 8000    Charlie
    2    602 600 8001    Charlie
    3    602 600 8002    Charlie

T-SQL Join with "max" filter

Update

3 Answers3