0

Consider the following table:

Column_A Column_B Column_C
1 UserA NULL
2 UserB NULL
3 UserC 1
4 UserA 1
5 UserB NULL
6 UserB 2
7 UserC 2

I'd like to return all rows (Column_A, Column_B, Column_C) such that either:

  1. Column_C is NULL, or
  2. for every unique value in Column_C, return the first row with Column_B == UserA. If no such row exists, return the first row sorted by Column_B.time_created.
Column_A Column_B (FK) Column_C
1 UserA NULL
2 UserB NULL
4 UserA 1
5 UserB NULL
6 UserB 2

Not sure how to do this in a single query.

select 
    Column_A, 
    Column_B, 
    Column_C, 
    min(case when Column_B = UserA then 0 else 1 end) as custom_order 
from Table 
where Column_B in (UserA, UserB, UserC) 
group by Column_C, Column_A 
order by Column_C nulls first, custom_order;

The min clause doesn't de-dupe as expected.

Shmiel
  • 1,201
  • 10
  • 25
polestar
  • 25
  • 3
  • [Edit] the question and **only** tag the DBMS you're really using. Postgres and MySQL are very different in their capabilities, so I removed the incompatible tags. Also tag the specific version, it make make a difference too, especially in the case you're using MySQL. – sticky bit Nov 18 '21 at 20:40
  • What column defines the order in a numerical, timestamp-/date-ish or lexicographical way? (Words like "first" etc. aren't really good for that...) – sticky bit Nov 18 '21 at 20:44
  • what is column_B.Time_Created? It's not in your table columns defined... – xQbert Nov 18 '21 at 20:45
  • Column_B is an FK column to a user table. Each user row has a time_created field that should be used for choosing the row to return when Column_B != UserA, – polestar Nov 18 '21 at 20:48

2 Answers2

0

Assuming you have analytic functions available...

I break this apart let's get all the null values 1st.. Then union in the non nulls based on the order you want using an analytic (row_number), an embedded case expression, and two order bys.

SELECT Column_A, Column_B, Column_C, 0 as RN
FROM TABLE
WHERE COLUMN_C is null

UNION ALL

SELECT Column_A, Column_B, Column_C, RN
FROM (
  SELECT A.*, ROW_NUMBER() over (partition by A.column_C Order by case A.column_B when 'UserA' then 0 else 1 end, U.Time_Created) rn
  FROM Table A
  INNER JOIN user U
    on U.Column_B = A.Column_B
  WHERE A.Column_C is not null) B
WHERE RN = 1

the whole user join is based on a comment not the question. it assumes table joins to user on column_B

We use a case statement to handle ordering the 'UserA's first and then time created. so if you have two "userA"'s the first created will be first as well.

xQbert
  • 34,733
  • 2
  • 41
  • 62
0

Thanks xQbert for your help. Updated the answer to fix a few things:

  1. No comma between partition and order
  2. Filter by user_ids in each subquery - there may be users in the table we don't care about.

'''

SELECT Column_A, Column_B, Column_C, 0 as RN
FROM TABLE
WHERE COLUMN_C is null and Column_B in (UserA, UserB, UserC)

UNION ALL

SELECT Column_A, Column_B, Column_C, RN
FROM (
  SELECT A.*, ROW_NUMBER() over (partition by A.column_C Order by case A.column_B when 'UserA' then 0 else 1 end, U.Time_Created) rn
  FROM Table A
  INNER JOIN user U
    on U.Column_B = A.Column_B
  WHERE A.Column_C is not null and ColumnB in (userA, userB, UserC)) B
WHERE RN = 1

'''

polestar
  • 25
  • 3
  • Glad it all worked out for you. Remember; when eating an elephant, take one bite at a time. That's all I did here was break apart the pieces and put it back together. We could probably avoid the union; but as they are "Fast" I didn't see a reason not to break out the query into understandable segments. – xQbert Nov 19 '21 at 19:44