-1

Hopefully this is an easy problem, and I'm just thinking about it wrong.

I have an association between id_1 and id_2. There is a 1->n relationship between id_1 and id_2. Namely, if no id_2 exists for id_1 it has a default value of -1. This can look like:

id_1   id_2   info
---- | ---- | ----
120   -1      'dont'
120   444     'show'
123   -1      'test'
124   -1      'hello'
125   -1      'world'
125   123     'oh wait'
126   -1      'help'
126   201     'me'
127   -1      'sql'
127   206     'hell'

Basically I want to have a SQL query that gets the highest id_2 associated with an id_1, given an IN statement matching criteria from id_1. If I can't have an IN statement that's also fine as well, I'm just trying to optimize something else.

Here is what the answer should look like given id_1 containing (123,124,125,126,127) and id_2 containing (-1,-1,123,201,206):

id_1   id_2   info
---- | ---- | ----
123    -1     'test'
124    -1     'hello'
125    123    'oh wait'
126    201    'me'
127    206    'hell'

Notice how id_1 120 ends up excluded. I've tried:

SELECT DISTINCT id_1, id_2, info
FROM ids
WHERE
      id_1 IN (123, 124, 125, 126, 127) AND
      id_2 IN (-1, -1, 123, 201, 206)

But this gives multiple id_2 values for a single id_1.

I've also tried a GROUP BY on id_2 but that only gives me one id_1 that has -1.

SELECT id_1, id_2, info
FROM ids
WHERE
      id_1 IN (123, 124, 125, 126, 127) AND
      id_2 IN (-1, -1, 123, 201, 206)
GROUP BY id_2

results in:

id_1   id_2    info
----- | ---- | ----
123    -1     'test'
125    123    'oh wait'
126    201    'me'
127    206    'hell'

Note the missing id_1 of 124.

So how can I write this query to get the most recent, distinct id_2 for an id_1?

Rick James
  • 135,179
  • 13
  • 127
  • 222
Alex
  • 2,145
  • 6
  • 36
  • 72
  • Your GROUP BY is invalid and expected to raise an error. (And will do so in newer MySQL versions, unless in compatibility mode.) – jarlh Jan 20 '22 at 10:17
  • Don't forget to _specify_ the expected result. – jarlh Jan 20 '22 at 10:18
  • @jarlh strange, I must be in compatibility mode o_o, also updated the question to show the expected answer, where only `id_1` is included based on criteria. – Alex Jan 20 '22 at 10:33
  • This is probably a good use for `NULL` -- instead of "-1". `MAX(id2)` gives the same answer either way. – Rick James Jan 20 '22 at 18:36
  • Was "120" deliberately left out? Why? Is that exclusion part of the Question? – Rick James Jan 20 '22 at 18:38

1 Answers1

2

You may user ROW_NUMBER if on MySQL 8+:

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY id_1 ORDER BY id_2 DESC) rn
    FROM ids
    WHERE id_1 IN (123, 124, 125, 126, 127) AND
          id_2 IN (-1, -1, 123, 201, 206)
)

SELECT id_1, id_2, info
FROM cte
WHERE rn = 1;

On earlier versions of MySQL, you can identify the max record per id_1 group using exists logic:

SELECT i1.*
FROM ids i1
WHERE id_1 IN (123, 124, 125, 126, 127) AND
      id_2 IN (-1, -1, 123, 201, 206) AND
      NOT EXISTS (
          SELECT 1
          FROM ids i2
          WHERE i2.id_1 = i1.id_1 AND
                i2.id_2 > i1.id_2 AND
                i2.id_1 IN (123, 124, 125, 126, 127) AND
                i2.id_2 IN (-1, -1, 123, 201, 206)
      );
Alex
  • 2,145
  • 6
  • 36
  • 72
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360