Filtering unique values

Question

Consider the following table:

Column_A	Column_B	Column_C
1	UserA	NULL
2	UserB	NULL
3	UserC	1
4	UserA	1
5	UserB	NULL
6	UserB	2
7	UserC	2

I'd like to return all rows (Column_A, Column_B, Column_C) such that either:

Column_C is NULL, or
for every unique value in Column_C, return the first row with Column_B == UserA. If no such row exists, return the first row sorted by Column_B.time_created.

Column_A	Column_B (FK)	Column_C
1	UserA	NULL
2	UserB	NULL
4	UserA	1
5	UserB	NULL
6	UserB	2

Not sure how to do this in a single query.

select 
    Column_A, 
    Column_B, 
    Column_C, 
    min(case when Column_B = UserA then 0 else 1 end) as custom_order 
from Table 
where Column_B in (UserA, UserB, UserC) 
group by Column_C, Column_A 
order by Column_C nulls first, custom_order;

The min clause doesn't de-dupe as expected.

[Edit] the question and **only** tag the DBMS you're really using. Postgres and MySQL are very different in their capabilities, so I removed the incompatible tags. Also tag the specific version, it make make a difference too, especially in the case you're using MySQL. — sticky bit, Nov 18 '21 at 20:40
What column defines the order in a numerical, timestamp-/date-ish or lexicographical way? (Words like "first" etc. aren't really good for that...) — sticky bit, Nov 18 '21 at 20:44
what is column_B.Time_Created? It's not in your table columns defined... — xQbert, Nov 18 '21 at 20:45
Column_B is an FK column to a user table. Each user row has a time_created field that should be used for choosing the row to return when Column_B != UserA, — polestar, Nov 18 '21 at 20:48

xQbert · Accepted Answer · 2021-11-19T19:41:39.890

0

Assuming you have analytic functions available...

I break this apart let's get all the null values 1st.. Then union in the non nulls based on the order you want using an analytic (row_number), an embedded case expression, and two order bys.

SELECT Column_A, Column_B, Column_C, 0 as RN
FROM TABLE
WHERE COLUMN_C is null

UNION ALL

SELECT Column_A, Column_B, Column_C, RN
FROM (
  SELECT A.*, ROW_NUMBER() over (partition by A.column_C Order by case A.column_B when 'UserA' then 0 else 1 end, U.Time_Created) rn
  FROM Table A
  INNER JOIN user U
    on U.Column_B = A.Column_B
  WHERE A.Column_C is not null) B
WHERE RN = 1

the whole user join is based on a comment not the question. it assumes table joins to user on column_B

We use a case statement to handle ordering the 'UserA's first and then time created. so if you have two "userA"'s the first created will be first as well.

edited Nov 19 '21 at 19:41

answered Nov 18 '21 at 20:50

xQbert

34,733
2
41
62

I tried the second half of the query, but get an error: ERROR: syntax error at or near "order" LINE 1: ..., ROW_NUMBER() over (partition by A.Column_C, order by c... – polestar Nov 18 '21 at 21:43
Nvmd, no comma between partition by and order by. Thanks for your help, checking the query now. – polestar Nov 18 '21 at 21:59
@polestar glad it worked out. – xQbert Nov 19 '21 at 19:41

score 0 · Answer 2 · answered Nov 18 '21 at 22:17

Thanks xQbert for your help. Updated the answer to fix a few things:

No comma between partition and order
Filter by user_ids in each subquery - there may be users in the table we don't care about.

'''

SELECT Column_A, Column_B, Column_C, 0 as RN
FROM TABLE
WHERE COLUMN_C is null and Column_B in (UserA, UserB, UserC)

UNION ALL

SELECT Column_A, Column_B, Column_C, RN
FROM (
  SELECT A.*, ROW_NUMBER() over (partition by A.column_C Order by case A.column_B when 'UserA' then 0 else 1 end, U.Time_Created) rn
  FROM Table A
  INNER JOIN user U
    on U.Column_B = A.Column_B
  WHERE A.Column_C is not null and ColumnB in (userA, userB, UserC)) B
WHERE RN = 1

'''

Glad it all worked out for you. Remember; when eating an elephant, take one bite at a time. That's all I did here was break apart the pieces and put it back together. We could probably avoid the union; but as they are "Fast" I didn't see a reason not to break out the query into understandable segments. — xQbert, Nov 19 '21 at 19:44

Filtering unique values

2 Answers2

Linked