0

I need to join two tables in MariaDB in a specific way. For each project in table A I need to add one column for objective 1 and one column for objective 2. The value of objective 1 and objective 2 is the value of the field is_core associated with that objective in the table B in the highest phase as in the example.

In my real problem there will be more objectives but for simplicity I have used only 2 here.

Can anyone help with this? I am not very experienced with DB and my head is burning.

Table A

id name
1 project A
2 project B
3 project C

Table B

id project_id objective is_core phase
1 1 objective1 0 0
2 1 objective1 0 1
3 1 objective1 1 2
4 1 objective2 0 0
5 1 objective2 0 1
6 1 objective2 0 2
7 2 objective1 1 0
8 2 objective2 0 0
9 3 objective1 0 0
10 3 objective2 0 0

TableA.id links to TableB.projectId

Desired result:

project_id name objective1 objective2
1 project A 1 0
2 project B 1 0
3 project B 0 0

Many thanks!

user1191247
  • 10,808
  • 2
  • 22
  • 32
Jesus Paradinas
  • 189
  • 2
  • 12

3 Answers3

3

My apologies! I have just reread your question and I obviously completely missed the explanation of the values for the objective columns. The highest phase requirement turns this into a fairly typical groupwise-maximum problem. Here are three of the many possible solutions.

This first query is just using two correlated subqueries in the SELECT list:

SELECT a.id AS project_id, a.name,
    (
        SELECT is_core
        FROM tableB
        WHERE project_id = a.id
        AND objective = 'objective1'
        ORDER BY phase DESC
        LIMIT 1
    ) AS objective1,
    (
        SELECT is_core
        FROM tableB
        WHERE project_id = a.id
        AND objective = 'objective2'
        ORDER BY phase DESC
        LIMIT 1
    ) AS objective2
FROM tableA a;

This next query uses NOT EXISTS to make sure we have the latest phase and conditional aggregation:

SELECT b.project_id, a.name, 
    MAX(objective = 'objective1' AND is_core) AS objective1,
    MAX(objective = 'objective2' AND is_core) AS objective2
FROM tableA a
JOIN tableB b ON a.id = b.project_id
              AND NOT EXISTS (
                  SELECT 1 FROM tableB
                  WHERE project_id = b.project_id
                  AND objective = b.objective
                  AND phase > b.phase
              )
GROUP BY b.project_id;

This query checks the value is_core for the latest phase by using ROW_NUMBER() and conditional aggregation:

SELECT project_id, name,
    MAX(objective = 'objective1' AND is_core) AS objective1,
    MAX(objective = 'objective2' AND is_core) AS objective2
FROM (
    SELECT
        a.name, b.project_id, b.objective, b.is_core,
        ROW_NUMBER() OVER (PARTITION BY project_id, objective ORDER BY phase DESC) rn
    FROM tableA a
    JOIN tableB b ON a.id = b.project_id
) t
WHERE rn = 1
GROUP BY project_id;

All three of these queries will probably perform best with a full covering index on (project_id, objective, phase, is_core) but should at least have an index on (project_id, objective). YMMV!

Here's a fiddle

user1191247
  • 10,808
  • 2
  • 22
  • 32
1
SELECT *,
       EXISTS ( SELECT NULL
                FROM tableB 
                WHERE tableB.project_id = tableA.project_id
                   AND tableB.objective = 'objective1' ) objective1,
       EXISTS ( SELECT NULL
                FROM tableB 
                WHERE tableB.project_id = tableA.project_id
                   AND tableB.objective = 'objective2' ) objective2
FROM tableA;

If the amount of objectives (and hence the amount of final columns) is indefinite (i.e. you need in PIVOT) then use dynamic SQL.


our MariaDB version is 10.6.11-MariaDB-log – Jesus Paradinas

WITH cte AS (
  SELECT *, 
         ROW_NUMBER() OVER (PARTITION BY project_id, objective ORDER BY phase DESC) rn
  FROM B
)
SELECT A.id, A.name,
       MAX((objective = 'objective1') * (is_core)) objective1,
       MAX((objective = 'objective2') * (is_core)) objective2
FROM A
JOIN cte ON A.id = cte.project_id
WHERE rn = 1
GROUP BY cte.project_id

https://dbfiddle.uk/M6GMRIMH

PS. The same solution (which uses not CTE but subquery) was already provided by @nnichols..

Akina
  • 39,301
  • 5
  • 14
  • 25
  • Thanks for your answer but this is not working. It is not taking into acount the value isCore – Jesus Paradinas Mar 17 '23 at 15:32
  • 1
    @JesusParadinas Add `.. AND tableB.is_core` to each WHERE. – Akina Mar 17 '23 at 16:03
  • thanks Akina but we need to fetch the is_core value when that phase is MAX – Jesus Paradinas Mar 17 '23 at 16:31
  • 1
    @JesusParadinas What if maximal phase is present in 2 rows, one with is_core set and another with not set? – Akina Mar 17 '23 at 17:13
  • Akina thanks for your comments. The maximal phase can be present only once for each objective. – Jesus Paradinas Mar 18 '23 at 21:18
  • 1
    @JesusParadinas *The maximal phase can be present only once for each objective.* You cannot create a constraint which provides this restriction. How do you achieve the data integrity? PS. Anycase - what is your MariaDB version? – Akina Mar 19 '23 at 09:33
  • 1
    @nnichols This constraint does not provide only and definitely maximal phase to be set. It does not provide the same project/objective to have phase to be both set and dropped even. – Akina Mar 20 '23 at 04:40
  • our MariaDB version is 10.6.11-MariaDB-log – Jesus Paradinas Mar 20 '23 at 09:33
1

You can make two left join, one for objective1 and the other for objective2.

SELECT tableA.id AS project_id,
       tableA.name,
       MAX(tableB.is_core) AS "objective1",
       MAX(tableB2.is_core) AS "objective2"
FROM tableA
LEFT JOIN tableB ON tableB.project_id = tableA.id AND tableB.objective = "objective1"
LEFT JOIN tableB AS tableB2 ON tableB2.project_id = tableA.id AND tableB2.objective = "objective2"
GROUP BY tableA.id
  • looks good but the max() function should go with the "phase" column, and for that row where the phase is maximum take the is_core value for that objective. – Jesus Paradinas Mar 18 '23 at 21:17
  • 1
    I see, is the maximum value of phase column will always have maximum value of is_core column? or is there any possibilities that maximum value of phase column have not the maximum value of is_core value column? – Muhtarom Zain Mar 19 '23 at 05:15
  • we need the is_core value when the phase is maxium, for each objective type in a new column for each objective. – Jesus Paradinas Mar 20 '23 at 08:40