19

Given a table of "events" where each event may be associated with zero or more "speakers" and zero or more "terms", those records associated with the events through join tables, I need to produce a table of all events with a column in each row which represents the list of "speaker_names" and "term_names" associated with each event.

However, when I run my query, I have duplication in the speaker_names and term_names values, since the join tables produce a row per association for each of the speakers and terms of the events:

1|Soccer|Bobby|Ball
2|Baseball|Bobby - Bobby - Bobby|Ball - Bat - Helmets
3|Football|Bobby - Jane - Bobby - Jane|Ball - Ball - Helmets - Helmets

The group_concat aggregate function has the ability to use 'distinct', which removes the duplication, though sadly it does not support that alongside the custom separator, which I really need. I am left with these results:

1|Soccer|Bobby|Ball
2|Baseball|Bobby|Ball,Bat,Helmets
3|Football|Bobby,Jane|Ball,Helmets

My question is this: Is there a way I can form the query or change the data structures in order to get my desired results?

Keep in mind this is a sqlite3 query I need, and I cannot add custom C aggregate functions, as this is for an Android deployment.

I have created a gist which makes it easy for you to test a possible solution: https://gist.github.com/4072840

murb
  • 1,736
  • 21
  • 31
aiwilliams
  • 193
  • 1
  • 1
  • 5

6 Answers6

10

Look up the speaker/term names independently from each other:

SELECT _id,
       name,
       (SELECT GROUP_CONCAT(name, ';')
        FROM events_speakers
        JOIN speakers
          ON events_speakers.speaker_id = speakers._id
        WHERE events_speakers.event_id = events._id
       ) AS speaker_names,
       (SELECT GROUP_CONCAT(name, ';')
        FROM events_terms
        JOIN terms
          ON events_terms.term_id = terms._id
        WHERE events_terms.event_id = events._id
       ) AS term_names
FROM events
CL.
  • 173,858
  • 17
  • 217
  • 259
7

I ran accross this problem as well, but came up with a method that I found a bit easier to comprehend. Since SQLite reports SQLite3::SQLException: DISTINCT aggregates must have exactly one argument, the problem seems not so much related to the GROUP_CONCAT method, but with using DISTINCT within GROUP_CONCAT...

When you encapsulate the DISTINCT 'subquery' within a REPLACE method that actually does nothing you can have the relative simplicity of nawfal's suggestion without the drawback of only being able to concat comma-less strings properly.

SELECT events._id, events.name, 
       (group_concat(replace(distinct speakers.name),'',''), ' - ') AS speaker_names, 
       (group_concat(replace(distinct speakers.name),'',''), ' - ') AS term_names 
FROM events 
LEFT JOIN 
   (SELECT et.event_id, ts.name 
    FROM terms ts 
    JOIN events_terms et ON ts._id = et.term_id
   ) terms ON events._id = terms.event_id 
LEFT JOIN 
   (SELECT sp._id, es.event_id, sp.name 
    FROM speakers sp 
    JOIN events_speakers es ON sp._id = es.speaker_id
   ) speakers ON events._id = speakers.event_id 
GROUP BY events._id;

But actually I would consider this a SQLite bug / inconsistency, or am I missing something?

murb
  • 1,736
  • 21
  • 31
  • Thank you for taking the time to write this up! It's been a long while for me since working on this problem, but after a few minutes review, I would have to agree that it seems like an inconsistency; I would expect the group_concat to support the custom separator with distinct. I don't find either solution hard to comprehend (once revealed!), and would prefer the one the does not require the invocation of a function that does nothing. – aiwilliams May 10 '14 at 16:43
  • This doesn't work. As Mi-La's edit in their answer also admits. The distinct keyword no longer has the intended effect. – Matt Chambers Jul 25 '17 at 18:22
  • The distinct keyword does have its intended effect for me (sqlite within an ios app) so this solutions worked well for me, thanks! – gpkamp Jan 08 '20 at 00:12
6

That's strange that SQLite doesnt support that!.

At the risk of being down voted, only if it helps:

You can avail Replace(X, Y, Z). But you have to be sure you wont have valid , values in your columns..

SELECT events._id, events.name, 
       REPLACE(group_concat(distinct speakers.name), ',', ' - ') AS speaker_names, 
       REPLACE(group_concat(distinct terms.name), ',', ' - ') AS term_names 
FROM events 
LEFT JOIN 
   (SELECT et.event_id, ts.name 
    FROM terms ts 
    JOIN events_terms et ON ts._id = et.term_id
   ) terms ON events._id = terms.event_id 
LEFT JOIN 
   (SELECT sp._id, es.event_id, sp.name 
    FROM speakers sp 
    JOIN events_speakers es ON sp._id = es.speaker_id
   ) speakers ON events._id = speakers.event_id 
GROUP BY events._id;
Community
  • 1
  • 1
nawfal
  • 70,104
  • 56
  • 326
  • 368
3

The problem arises only with the group_concat(X,Y) expression, not with the group_concat(X) expression.

group_concat(distinct X) works well.

So, if the ',' is good for you, there is no problem, but if you want a ';' instead of ',' (and you are sure no ',' is in your original text) you can do:

replace(group_concat(distinct X), ',', ';')

hussic
  • 1,816
  • 9
  • 10
1

Just to put a proper workaround (murb's answer is strangely parenthesized).

problem:

group_concat(distinct column_name, 'custom_separator') takes custom_separator as a part of distinct.

solution:

We need some no-op to let SQLite know that distinct finished (to wrap distinct and it's arguments). No-op can be replace with empty string as a second parameter (documentation to replace).

group_concat(replace(distinct column_name, '', ''), 'custom_separator')

edit:

just found that it does not work :-( - can be called but distinct is not working anymore

Mi-La
  • 685
  • 10
  • 18
1

There is a special case that does not work in sqlite : group_concat(DISTINCT X, Y) Whereas in SQL you can use group_concat(DISTINCT X SEPARATOR Y) in sqlite you can't

This example : Select group_concat(DISTINCT column1, '|') from example_table group by column2; gives the result : DISTINCT aggregates must have exactly one argument At line 1:

The solution :

select rtrim(replace(group_concat(DISTINCT column1||'@!'), '@!,', '|'),'@!') from example_table
Makina
  • 11
  • 1