0

I have part of sql code :

select   
    ,inst_date
    ,country
    ,count (distinct (case when (event_day - inst_date) <= 0 then id end)) as event_0
    ,count (distinct (case when (event_day - inst_date) <= 1 then id end)) as event_1
    ,count (distinct (case when (event_day - inst_date) <= 2 then id end)) as event_2
    ,count (distinct (case when (event_day - inst_date) <= 3 then id end)) as event_3
    ,count (distinct (case when (event_day - inst_date) <= 4 then id end)) as event_4
    ,count (distinct (case when (event_day - inst_date) <= 5 then id end)) as event_5
    ,count (distinct (case when (event_day - inst_date) <= 6 then id end)) as event_6
    ,count (distinct (case when (event_day - inst_date) <= 7 then id end)) as event_7
from t1
all left join 
(
    select distinct id
         , toDate(event_date) as event_day
         , 1 as event
    from events
) as events_
using id
where (event_day - inst_date) between 0 and 7
group by inst_date
    ,country ;

and I see my data :

inst_date country event_0 event_1 event_2 event_3 event_4 event_5 event_6 event_7
2020-12-01 us 10 11 11 12 13 14 14 14

but I want to collect this data like

inst_date country date_difference events_count
2020-12-01 us 0 10
2020-12-01 us 1 11
2020-12-01 us 2 11
2020-12-01 us 3 12
2020-12-01 us 4 13
2020-12-01 us 5 14
2020-12-01 us 6 14
2020-12-01 us 7 14

at t1 table - list of id's with country and inst date. At events table - date of event by id

mitrik_bnr
  • 27
  • 6
  • it looks like a simple cumulative sum? https://stackoverflow.com/search?q=%5Bclickhouse%5DarrayCumSum https://clickhouse.tech/docs/en/sql-reference/functions/array-functions/#arraycumsumfunc-arr1 https://clickhouse.tech/docs/en/sql-reference/functions/other-functions/#runningaccumulate – Denny Crane Dec 27 '20 at 16:26
  • Unfortunately no, if in CASE block was SUM, array cumsum can be used here, but i have to count distinct id's at current period – mitrik_bnr Dec 27 '20 at 16:39
  • 1
    i see, https://groups.google.com/g/clickhouse/c/BqE49KuJpw8/m/I9Ymh6axEAAJ – Denny Crane Dec 27 '20 at 20:12

1 Answers1

0

Use group by:

select inst_date, country, 
    (event_day - inst_date) as date_difference,
    count(*) as events_count
from t1
all left join (
    select distinct id, toDate(event_date) as event_day, 1 as event
    from events
) as events_ using id
where (event_day - inst_date) between 0 and 7
group by inst_date, country, (event_day - inst_date)
order by inst_date, country, date_difference;

I would strongly recommend prefixing all columns in the query with the table they belong to, so the query is unambiguous about the underlying data structures.

GMB
  • 216,147
  • 25
  • 84
  • 135