0

I have a table like this

AID     B       C   
2471    D471    X
2471    D471    Y

2471    E471    X

2471    F471    Y

2472    D471    X
2472    D471    X

2473    E471    Y
2473    E471    Y

2474    F471    Y

I need to pick the AID which has either X or Y by grouping A and B, but it should not pick the records which have X as well as Y when we grouped by A and B

Expected Output

AID     B       C
2471    E471    X

2471    F471    Y

2472    D471    X
2472    D471    X

2473    E471    Y
2473    E471    Y

2474    F471    Y

If you have any questions about this query, please let me know.

Scripts to create a table and insert data

CREATE TABLE tablename
(
    [A] int
    , [B] varchar(10)
    , [C] varchar(5)
);

INSERT INTO tablename
    ([A], [B], [C])
VALUES
    (2471, 'D471', 'X'),
    (2471, 'D471', 'Y'),
    (2471, 'E471', 'X'),
    (2471, 'F471', 'Y'),
    (2472, 'D471', 'X'),
    (2472, 'D471', 'X'),
    (2473, 'E471', 'Y'),
    (2473, 'E471', 'Y'),
    (2474, 'F471', 'Y')
;
Suraj Kumar
  • 5,547
  • 8
  • 20
  • 42
Inc
  • 55
  • 4

2 Answers2

1

Assuming that the C column can only contain the values X and Y, you should be able to just check the distinct count:

WITH cte AS (
    SELECT A, B
    FROM tablename
    GROUP BY A, B
    HAVING COUNT(DISTINCT C) = 1
)

SELECT t1.A, t1.B, t1.C
FROM tablename t1
INNER JOIN cte t2
    ON t1.A = t2.A AND t1.B = t2.B;

screen capture of demo below

Demo

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Sorry, I forgot to mention that point.. not only X and Y, C column can have other values as well. I just tried your query by giving 'where c in ('X','Y')' inside the CTE as well as outside CTE, it is working fine.. thank you – Inc Mar 27 '20 at 05:44
1

You can use not exists. Assuming X and Y are the only values:

select t.*
from tablename t
where not exists (select 1
                  from tablename t2
                  where t2.a = t.a and
                        t2.b = t.b and
                        t2.c <> t.c
                 );

With an index on tablename(a, b, c), this is likely to have the best performance. In addition, it easily lets you fetch all the columns in the row.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786