2

I have a problem with a GROUP_CONCAT select that should also include row numbering, similar to the question "GROUP_CONCAT numbering". The difference is that I have to group by multiple columns.

As an example, I have two tables, review and review_detail.
Schema (MySQL v5.5)

create table review (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `submission_id` int(11) NOT NULL,
   PRIMARY KEY (`id`)
);

create table review_detail (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `review_id` int(11),
  `category_id` int(11),
  `rating` varchar(100),
  PRIMARY KEY (`id`)
);

insert into review (`id`, `submission_id`) values (1, 1), (2, 1), (3, 2), (4, 3), (5,1), (6,3), (7,2), (8,3);

insert into review_detail (`review_id`, `category_id`, `rating`)
values 
(1, 1, ' submission 1.1 cat 1'), (1, 2, ' submission 1.1 cat 2'),
(2, 1, ' submission 1.2 cat 1'), (2, 2, ' submission 1.2 cat 2'),
(3, 1, ' submission 2.1 cat 1'), (3, 2, ' submission 2.1 cat 2'),
(4, 1, ' submission 3.1 cat 1'), (4, 2, ' submission 3.1 cat 1'),
(5, 1, ' submission 1.3 cat 1'), (5, 2, ' submission 1.3 cat 2'),
(6, 1, ' submission 3.2 cat 1'), (6, 2, ' submission 3.2 cat 2'),
(7, 1, ' submission 2.2 cat 1'), (7, 2, ' submission 2.2 cat 2'),
(8, 1, ' submission 3.3 cat 1'), (6, 2, ' submission 3.3 cat 2')
;

Query #1

SELECT * FROM review;

| id  | submission_id |
| --- | ------------- |
| 1   | 1             |
| 2   | 1             |
| 3   | 2             |
| 4   | 3             |
| 5   | 1             |
| 6   | 3             |
| 7   | 2             |
| 8   | 3             |

Query #2

SELECT * FROM review_detail;

| id  | review_id | category_id | rating                |
| --- | --------- | ----------- | --------------------- |
| 1   | 1         | 1           |  submission 1.1 cat 1 |
| 2   | 1         | 2           |  submission 1.1 cat 2 |
| 3   | 2         | 1           |  submission 1.2 cat 1 |
| 4   | 2         | 2           |  submission 1.2 cat 2 |
| 5   | 3         | 1           |  submission 2.1 cat 1 |
| 6   | 3         | 2           |  submission 2.1 cat 2 |
| 7   | 4         | 1           |  submission 3.1 cat 1 |
| 8   | 4         | 2           |  submission 3.1 cat 1 |
| 9   | 5         | 1           |  submission 1.3 cat 1 |
| 10  | 5         | 2           |  submission 1.3 cat 2 |
| 11  | 6         | 1           |  submission 3.2 cat 1 |
| 12  | 6         | 2           |  submission 3.2 cat 2 |
| 13  | 7         | 1           |  submission 2.2 cat 1 |
| 14  | 7         | 2           |  submission 2.2 cat 2 |
| 15  | 8         | 1           |  submission 3.3 cat 1 |
| 16  | 6         | 2           |  submission 3.3 cat 2 |

Every review for a submission (foreign key = submission_id) has multiple review_detail entries, each with a category_id (my example has only 2 categories (1, 2); their values are not relevant for the query).

I have to create a select that returns the GROUP_CONCAT grouped by submission_id and category_id.

The concatenated string should be
Reviewer 1: {rating}, Reviewer 2: {rating}, Reviewer 3: {rating} etc.

E.g. for submission_id = 1 and category_id = 1 the GROUP_CONCAT should return
Reviewer 1: submission 1.1 cat 1, Reviewer 2: submission 1.2 cat 1, Reviewer 3: submission 1.3 cat 1.

But I couldn't get the numbering inside the GROUP_CONCAT right.

I have done multiple tests so far.

Group with only one column counter (works):
https://www.db-fiddle.com/f/6hA4Vft1mQGdw2Pew2An2T/3
Reviewer 1: submission 1.1 cat 1 of review 1 / Reviewer 2: submission 3.3 cat 1 of review 8 / Reviewer 3: submission 2.2 cat 1 of review 7 / Reviewer 4: submission 3.2 cat 1 of review 6 / ... etc.

SELECT
    -- review.submission_id,
    review_detail.category_id,
    @i,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := @i + 1,
            ': ',
            rating,
            ' of review ',  review_id
        )
    SEPARATOR ' / '
    ) concatText,
    @i := 0
FROM
    review_detail
LEFT JOIN review ON review.id = review_detail.review_id,
    (
SELECT
    @i := 0
) init
GROUP BY
    review_detail.category_id
ORDER BY
    review_detail.category_id ASC
;

Test with IF and a comparison against a string built from the 2 grouped columns (doesn't work):
https://www.db-fiddle.com/f/3woAVSw5hrav15jAmuWVdT/3
Reviewer 1: submission 1.1 cat 1 of review 1 / Reviewer 1: submission 1.2 cat 1 of review 2 / Reviewer 1: submission 1.3 cat 1 of review 5

SELECT
    submission_id,
    category_id,
    @i,
    @grp,
    CONCAT_WS("-", submission_id, category_id) AS catgroup,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := IF(
                @grp = CONCAT_WS("-", submission_id, category_id),
                @i + 1,
                IF(
                    @grp := CONCAT_WS("-", submission_id, category_id),
                    1,
                    1
                )
            ),
            ': ',
            rating,
            ' of review ',  review_id
        )
    ORDER BY review_id, submission_id, category_id 
    SEPARATOR ' / '
    ) concatText
FROM
    review_detail
LEFT JOIN review ON review.id = review_detail.review_id,
    (
SELECT
    @i := 0,
    @grp := ''
) init
GROUP BY
    review.submission_id,
    review_detail.category_id

So does anyone know a way to get the numbering in a GROUP_CONCAT call correct when multiple columns are grouped by?

BHoft
  • Upgrade to 8.0 or MariaDB 10.2 so you can get `ROW_NUMBER()`. – Rick James Oct 16 '20 at 21:42
  • Thank you all for your solutions. Every solution mentioned below works, so it's hard for me to give the bounty to a specific one. I hope it doesn't annoy you if I picked another solution. I really appreciate all the solutions below. – BHoft Oct 20 '20 at 07:03

4 Answers

1

You should avoid using user-defined variables like that in production code.

In the manual for MySQL 5.6 it says:

As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement.

And even in the documentation for 8.0 it states:

The order of evaluation for expressions involving user variables is undefined. For example, there is no guarantee that SELECT @a, @a:=@a+1 evaluates @a first and then performs the assignment.

In future releases this might not work anymore altogether:

Previous releases of MySQL made it possible to assign a value to a user variable in statements other than SET. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL.
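
As a minimal sketch of the pattern these quotes leave open, a variable can be assigned in its own SET statement and then only read in a later statement. The variable name @counter and the alias next_value are purely illustrative.

```sql
-- Assign in a dedicated SET statement ...
SET @counter := 0;

-- ... and only read the variable in a separate statement.
SELECT @counter + 1 AS next_value;
```

This avoids assigning and reading within the same statement, but it also means a per-row counter has to come from somewhere else.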

So here's a solution without user-defined variables:

SELECT 
r.submission_id,
rd.category_id,
GROUP_CONCAT(CONCAT('Reviewer ', (SELECT COUNT(*) + 1 
                                  FROM review 
                                  JOIN review_detail ON review.id = review_detail.review_id 
                                  WHERE r.submission_id = review.submission_id 
                                  AND review_detail.category_id = rd.category_id 
                                  AND review_detail.id < rd.id
                                 ), ': ', rating, ' of review ', review_id) ORDER BY rating SEPARATOR ' / ') AS shorter_column_name
FROM 
review r 
JOIN review_detail rd ON rd.review_id = r.id
GROUP BY r.submission_id, rd.category_id;

which returns

+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| submission_id | category_id | shorter_column_name                                                                                                                           |
+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
|             1 |           1 | Reviewer 1:  submission 1.1 cat 1 of review 1 / Reviewer 2:  submission 1.2 cat 1 of review 2 / Reviewer 3:  submission 1.3 cat 1 of review 5 |
|             1 |           2 | Reviewer 1:  submission 1.1 cat 2 of review 1 / Reviewer 2:  submission 1.2 cat 2 of review 2 / Reviewer 3:  submission 1.3 cat 2 of review 5 |
|             2 |           1 | Reviewer 1:  submission 2.1 cat 1 of review 3 / Reviewer 2:  submission 2.2 cat 1 of review 7                                                 |
|             2 |           2 | Reviewer 1:  submission 2.1 cat 2 of review 3 / Reviewer 2:  submission 2.2 cat 2 of review 7                                                 |
|             3 |           1 | Reviewer 1:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.2 cat 1 of review 6 / Reviewer 3:  submission 3.3 cat 1 of review 8 |
|             3 |           2 | Reviewer 1:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.2 cat 2 of review 6 / Reviewer 3:  submission 3.3 cat 2 of review 6 |
+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
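
To see where the numbers come from, it may help to run the correlated subquery on its own, before any aggregation. This is only a sketch against the schema from the question; the aliases r2, rd2, detail_id and reviewer_no are made up for illustration.

```sql
SELECT
    r.submission_id,
    rd.category_id,
    rd.review_id,
    rd.id AS detail_id,
    -- Number of rows in the same (submission_id, category_id) group
    -- that have a smaller review_detail.id, plus one.
    (SELECT COUNT(*) + 1
     FROM review r2
     JOIN review_detail rd2 ON r2.id = rd2.review_id
     WHERE r2.submission_id = r.submission_id
       AND rd2.category_id = rd.category_id
       AND rd2.id < rd.id) AS reviewer_no
FROM review r
JOIN review_detail rd ON rd.review_id = r.id
ORDER BY r.submission_id, rd.category_id, rd.id;
```

With the sample data, the rows for submission 1 and category 1 (detail ids 1, 3 and 9) should get reviewer_no 1, 2 and 3. The number depends only on review_detail.id, so it stays stable regardless of the order in which rows are read.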
fancyPants
  • By the way, in production code I wouldn't use variables anyway. They are not 100% safe to use. The manual states that you shouldn't set and read variables in the same statement. – fancyPants Oct 16 '20 at 13:28
  • This is no bug, https://www.db-fiddle.com/f/3woAVSw5hrav15jAmuWVdT/4 the problem is the sorting, it has to be done as a subquery – nbk Oct 16 '20 at 16:18
  • Well, in newer MySQL versions ORDER BYs in subqueries are optimized away, as they are unnecessary. – fancyPants Oct 16 '20 at 20:45
  • No, read my answer. It explains how MySQL handles ORDER BY in a subquery, and you can see in the example I posted how to make it work. – nbk Oct 16 '20 at 20:47
  • Thank you for your solution. I am not sure the manual really states that variables shouldn't be set and read in the same statement. It just says that variables should be defined before they are used. But I like that your solution works without variables, and because of this (and because your answer was also the first) I accept your answer. – BHoft Oct 20 '20 at 08:12
  • @BHoft Thank you and you're welcome. I updated my answer to include several quotes from the manual, why you shouldn't use user-defined variables in queries like that. – fancyPants Oct 20 '20 at 11:14
1

To fix your query:

The basic problem is that tables are by nature unsorted, which is why the MySQL optimizer removes the ORDER BY.

In MySQL it is enough to put all the tables in the FROM clause and make a subquery with the ORDER BY; MySQL will keep it.

In MariaDB this is not enough. You also have to add a LIMIT 18446744073709551615 so that the optimizer keeps it.
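
The MariaDB variant described above could look roughly like the sketch below: the same sorted derived table with the LIMIT added. Whether the outer query actually sees the rows in this order still depends on the server version and optimizer, so treat it as an illustration of the trick rather than a guarantee.

```sql
SELECT t1.*
FROM (
    SELECT review_id, submission_id, category_id, rating
    FROM review_detail
    LEFT JOIN review ON review.id = review_detail.review_id
    ORDER BY review_id, submission_id, category_id
    -- Largest possible row count; its only purpose is to keep
    -- the optimizer from dropping the ORDER BY above.
    LIMIT 18446744073709551615
) t1;
```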

Schema (MySQL v5.5)

Query #1

SELECT
    submission_id,
    category_id,
    @i,
    @grp,
    CONCAT_WS("-", submission_id, category_id) AS catgroup,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := IF(
                @grp = CONCAT_WS("-", submission_id, category_id),
                @i := @i + 1,
                IF(
                    @grp := CONCAT_WS("-", submission_id, category_id),
                    1,
                    1
                )
            ),
            ': ',
            rating,
            ' of review ',  review_id
        )
    ORDER BY review_id, submission_id, category_id 
    SEPARATOR ' / '
    ) concatText
FROM
    (SELECT review_id, submission_id, category_id,`rating` FROM review_detail
LEFT JOIN review ON review.id = review_detail.review_id
     ORDER BY review_id, submission_id, category_id ) t1,
    (
SELECT
    @i := 0,
    @grp := ''
) init


GROUP BY
    submission_id,
    category_id;

Result

| submission_id | category_id | @i  | @grp | catgroup | concatText                                                                                                                                    |
| ------------- | ----------- | --- | ---- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| 1             | 1           | 0   |      | 1-1      | Reviewer 3:  submission 1.1 cat 1 of review 1 / Reviewer 2:  submission 1.2 cat 1 of review 2 / Reviewer 1:  submission 1.3 cat 1 of review 5 |
| 1             | 2           | 3   | 1-1  | 1-2      | Reviewer 3:  submission 1.1 cat 2 of review 1 / Reviewer 2:  submission 1.2 cat 2 of review 2 / Reviewer 1:  submission 1.3 cat 2 of review 5 |
| 2             | 1           | 3   | 1-2  | 2-1      | Reviewer 1:  submission 2.1 cat 1 of review 3 / Reviewer 2:  submission 2.2 cat 1 of review 7                                                 |
| 2             | 2           | 2   | 2-1  | 2-2      | Reviewer 2:  submission 2.1 cat 2 of review 3 / Reviewer 1:  submission 2.2 cat 2 of review 7                                                 |
| 3             | 1           | 2   | 2-2  | 3-1      | Reviewer 2:  submission 3.1 cat 1 of review 4 / Reviewer 1:  submission 3.2 cat 1 of review 6 / Reviewer 3:  submission 3.3 cat 1 of review 8 |
| 3             | 2           | 3   | 3-1  | 3-2      | Reviewer 3:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.3 cat 2 of review 6 / Reviewer 1:  submission 3.2 cat 2 of review 6 |

View on DB Fiddle

nbk
  • Thank you for your solution. I had already suspected that the "missing" order of the table data is the cause of this. I have checked your fiddle, but it gives different results for MySQL versions 5.5 + 5.7 vs. 5.6 + 8. The numbering is also reversed in 5.5 and 5.7: Reviewer 3, 2, 1 instead of 1, 2, 3. Because the result differs between MySQL versions, I wouldn't use your solution. – BHoft Oct 20 '20 at 07:34
1

You need to use a two-step subquery to sort by reviewer number.

SET @i := 0;
SET @grp := '';
SELECT
    submission_id,
    category_id,
    GROUP_CONCAT(
      CONCAT(
        'Reviewer ',
        i,
        ': ',
        rating,
        ' of review ',  review_id
      )
      ORDER BY i
      SEPARATOR ' / '
    ) concatText
FROM
-- second, add numbering
(
  SELECT *,
    @i := IF(
      @grp = @grp := CONCAT_WS('-',submission_id,category_id),
      @i + 1, 1) i
  FROM
  -- first, sort for numbering
  (
    SELECT
        review_id,
        submission_id,
        category_id,
        rating
    FROM review_detail LEFT JOIN review ON review.id = review_detail.review_id
    ORDER BY
        submission_id,
        category_id,
        review_id
  ) t1
) t2
GROUP BY
    submission_id,
    category_id
;

db fiddle

etsuhisa
  • Thank you for your solution. Your fiddle does exactly what was requested and works in all MySQL versions. – BHoft Oct 20 '20 at 07:38
  • But I have accepted the other answer because it doesn't use variables, whose assignment in statements other than SET could be removed in future MySQL versions, if I understand the manual correctly. I hope you can understand my decision. From the MySQL 5.7 manual: "it is also possible to assign a value to a user variable in statements other than SET. (This functionality is deprecated in MySQL 8.0 and subject to removal in a subsequent release.)" – BHoft Oct 20 '20 at 08:16
0

For completeness, here is how this can be done in MySQL 8.0.

It works with COUNT(*) as a window function:

with base as (
  select
    review_id,
    submission_id,
    category_id,
    rating,
    -- running count within each (submission_id, category_id) group;
    -- rows that tie on review_id get the same count
    count(*) over (partition by submission_id, category_id order by review_id) num
  from review_detail
  left join review on review.id = review_detail.review_id
)
select
  submission_id,
  category_id,
  group_concat(
    concat('Reviewer ', num, ': ', rating, ' of review ', review_id)
    order by num
    separator ', '
  ) concattext
from base
group by
  submission_id,
  category_id
;

or with ROW_NUMBER():

with base as (
  select
    review_id,
    submission_id,
    category_id,
    rating,
    -- consecutive number within each (submission_id, category_id) group
    row_number() over (partition by submission_id, category_id order by review_id) num
  from review_detail
  left join review on review.id = review_detail.review_id
)
select
  submission_id,
  category_id,
  group_concat(
    concat('Reviewer ', num, ': ', rating, ' of review ', review_id)
    order by num
    separator ', '
  ) concattext
from base
group by
  submission_id,
  category_id
;
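
The two variants only differ when several rows in a group share the same review_id. A small sketch (MySQL 8.0, same schema as in the question; the window name w and the aliases running_count and row_num are made up here) puts both functions side by side:

```sql
SELECT
    review_id,
    submission_id,
    category_id,
    -- With ORDER BY in the window, the default frame runs up to the
    -- current row and includes its peers, so ties share one value.
    COUNT(*) OVER w AS running_count,
    -- ROW_NUMBER() always yields distinct consecutive numbers.
    ROW_NUMBER() OVER w AS row_num
FROM review_detail
LEFT JOIN review ON review.id = review_detail.review_id
WINDOW w AS (PARTITION BY submission_id, category_id ORDER BY review_id);
```

In the sample data, submission 3 / category 2 contains two detail rows with review_id 6, so running_count should come out as 1, 3, 3 while row_num gives 1, 2, 3. If gap-free numbering is the goal, ROW_NUMBER() is probably the safer choice.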

DB Fiddle

BHoft