2

I have this SQL query:



SELECT 
    strftime('%Y-%m-%d', start_time) as day, group_concat(event_id, ' | ') as events
FROM
    events_table
WHERE 
    start_time BETWEEN '2022-01-01 00:00:00' and '2022-03-31 23:59:59'
    and event_id is not null
GROUP by day

This returns:


|-------------------------------------------------------------------------|
| day        |   events                                                   |
|-------------------------------------------------------------------------|
| 1999-01-04 |  event_1 | event_2 | event_3 | event_4                     |
| 1999-01-05 |  event_1 | event_1 | event_2                               |
| 1999-01-07 |  event_1 | event_2 | event_2 | event_3 | event_2 | event_3 |
|-------------------------------------------------------------------------|

Inside day events are ordered by start time. But I want to get:


|-------------------------------------------------------------------------|
| day        |   events                                                   |
|-------------------------------------------------------------------------|
| 1999-01-04 |  event_1 | event_2 | event_3 | event_4                     |
| 1999-01-05 |  event_1 | event_2                                         |
| 1999-01-07 |  event_1 | event_2 | event_3 | event_2 | event_3           |
|-------------------------------------------------------------------------|


As You can see, if consecutive values of column equal they should be joined into one, values delimited by | sign

In other words if two or more consecutive values of column are same they should be merged into one, so avoid repetition

forpas
  • 160,666
  • 10
  • 38
  • 76
Kiazim Khutaba
  • 191
  • 2
  • 14
  • 1
    Can you please share the sample data – Jens Feb 15 '23 at 08:51
  • Why are event_2 and event_3 for 1999-01-07 not distinct in your expected output? – forpas Feb 15 '23 at 09:12
  • Sample data file https://easyupload.io/df0r0p – Kiazim Khutaba Feb 15 '23 at 09:25
  • @forpas, I should only compare one value with another if this value near to another value, in the source table for 1999-01-07 day i have only two consecutive equal values – Kiazim Khutaba Feb 15 '23 at 09:32
  • Then what is the point of "I tried to use keyword DISTICNT but it simply deletes duplicates and doesn't preserve the order of values"? DISTINCT removes all duplicates not only the consecutive ones. – forpas Feb 15 '23 at 09:36
  • Also, in your sample data there are empty event_ids. Should they appear in your results? If yes, how? – forpas Feb 15 '23 at 09:37
  • @forpas, 1) i've remove mention of distinct, as it now irrelevant 2) my actual data contains null values, so i think this rows should be discarded Actually, I want to merge consecutive values if they same, that is, to avoid repetition – Kiazim Khutaba Feb 15 '23 at 11:20

1 Answers1

1

SQLite's GROUP_CONCAT() aggregate function does not support an ORDER BY clause and this means that even if you do get results in your expected order this order is not guaranteed.

Instead you should use GROUP_CONCAT() window function.

First, you must filter out the rows with empty event_ids and use LAG() window function to check if the previous event_id of the same date of each row (if it exists) is the same as the current one, so that it should be also filtered out:

WITH cte AS (
  SELECT *, event_id IS NOT LAG(event_id) OVER (PARTITION BY date(start_time) ORDER BY start_time) flag
  FROM events_table
  WHERE event_id <> ''
)
SELECT DISTINCT date(start_time) AS day, 
       GROUP_CONCAT(event_id, ' | ') OVER (
         PARTITION BY date(start_time) 
         ORDER BY start_time 
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS events
FROM cte
WHERE flag AND start_time BETWEEN '1999-01-01 00:00:00' AND '1999-03-31 23:59:59';

See the demo.

forpas
  • 160,666
  • 10
  • 38
  • 76