0

I have the following payments table

┌─name───────────────────────────┬─type────────────────────────────┐
│ payment_id                     │ UInt64                          │
│ factory                        │ String                          │
│ user_id                        │ UInt64                          │
│ amount_cents                   │ Int64                           │
│ action                         │ String                          │
│ success                        │ UInt8                           │
│ country                        │ FixedString(2)                  │
│ created_at                     │ DateTime                        │
│ finished_at                    │ Nullable(DateTime)              │
└────────────────────────────────┴─────────────────────────────────┘

With sample data

┌─factory───┬─────────finished_at─┬─payment_id─┬─country─┬─action──┬─amount_cents─┬─user_id───┬
│ 0_factory │ 2021-01-18 00:00:01 │          1 │ BY      │ payment │            1 │         1 │ 
│ 0_factory │ 2021-01-18 00:00:02 │          2 │ BY      │ payment │            1 │         1 │ 
│ 1_factory │ 2021-01-18 00:00:02 │          2 │ PL      │ win     │            4 │         1 │ 
│ 1_factory │ 2021-01-18 00:00:03 │          3 │ PL      │ win     │            7 │         1 │ 
│ 2_factory │ 2021-01-18 00:00:01 │          4 │ PL      │ win     │            7 │         1 │ 
│ 2_factory │ 2021-01-18 00:00:02 │          1 │ PL      │ payment │            7 │         1 │ 
│ 2_factory │ 2021-01-18 00:00:03 │          2 │ PL      │ win     │            7 │         1 │ 
│ 2_factory │ 2021-01-18 00:00:04 │          3 │ GR      │ win     │            2 │         1 │ 
└───────────┴─────────────────────┴────────────┴─────────┴─────────┴─────────┴────────────────┘

This is an example of what I have right now with

SELECT
       factory,
        user_id,
        payment_id,
        action,
        created_at
    FROM payments_all
    WHERE (payments_all.action = 'payment') AND (payments_all.factory IN ('0_factory', '1_factory', '2_factory')) AND isNotNull(payments_all.created_at)
    GROUP BY
        factory,
        user_id,
        payment_id,
        action
    HAVING (min(created_at) >= toDate('2019-01-01 00:00:00')) AND (min(created_at) < toDate('2021-10-01 00:00:00'))
    ORDER BY user_id

┌─factory───┬─user_id─┬─payment_id─┬─action──┬──────────created_at─┐
│ 1_factory │       1 │          1 │ payment │ 2021-02-04 09:00:00 │
│ 0_factory │       1 │          1 │ payment │ 2021-01-17 00:00:01 │
│ 0_factory │       1 │          2 │ payment │ 2021-01-17 00:00:06 │
└───────────┴─────────┴────────────┴─────────┴─────────────────────┘

I need to add new column first_payment

first_payment takes value 1 if action is payment && it is first payment for a user. Otherwise it takes value 0.

the first_payment should be checked for all period So expected result is:

┌─factory───┬─────────finished_at─┬─payment_id─┬─country─┬─action──┬─amount_cents─┬─user_id───┬first_payment─┐
│ 0_factory │ 2021-01-18 00:00:01 │          1 │ BY      │ deposit │            1 │         1 │           1  │
│ 0_factory │ 2021-01-18 00:00:02 │          2 │ BY      │ deposit │            1 │         1 │           0  │ 
│ 1_factory │ 2021-01-18 00:00:02 │          2 │ PL      │ win     │            4 │         1 │           0  │
│ 1_factory │ 2021-01-18 00:00:03 │          3 │ PL      │ win     │            7 │         1 │           0  │
│ 2_factory │ 2021-01-18 00:00:01 │          4 │ PL      │ win     │            7 │         1 │           0  │
│ 2_factory │ 2021-01-18 00:00:02 │          1 │ PL      │ deposit │            7 │         1 │           1  │
│ 2_factory │ 2021-01-18 00:00:03 │          2 │ PL      │ win     │            7 │         1 │           0  │
│ 2_factory │ 2021-01-18 00:00:04 │          3 │ GR      │ win     │            2 │         1 │           0  │
└───────────┴─────────────────────┴────────────┴─────────┴─────────┴─────────┴────────────────┘
dehelden
  • 39
  • 6

2 Answers2

0

As I can see for first payment the payment_id is always 1. So, I think you can use CASE WHEN payment_id=1 Then 1 ELSE 0 END AS first_payment. Please check query below =>

WITH CTE AS
(SELECT
       factory,
        user_id,
        payment_id,
        action,
        created_at
    FROM payments_all
    WHERE (payments_all.action = 'payment') AND (payments_all.factory IN ('0_factory', '1_factory', '2_factory')) AND isNotNull(payments_all.created_at)
    GROUP BY
        factory,
        user_id,
        payment_id,
        action
    HAVING (min(created_at) >= toDate('2019-01-01 00:00:00')) AND (min(created_at) < toDate('2021-10-01 00:00:00'))
) T1  

SELECT *,CASE WHEN payment_id=1 Then 1
         ELSE 0 END AS first_payment 
FROM CTE  
ORDER BY T1.user_id

NOTE: Query is written in SQL Server. Please check and let me know.

Srijon Chakraborty
  • 2,007
  • 2
  • 7
  • 20
0

I couldn't find much about ClickHouse, but it doesn't appear to support Windowed Functions.

Your example output also seems to be exactly the same as your sample table, plus one additional column, so I'm not sure what you GROUP BY was meant to achieve.

So, I'd use a LEFT JOIN on to a sub-query.

SELECT
  payments_all.*,
  CASE WHEN user_summary.user_id IS NOT NULL THEN 1 ELSE 0 END AS first_payment
FROM
  payments_all
LEFT JOIN
(
  SELECT
    user_id,
    factory,
    MIN(created_at)  AS first_created_at
  FROM
    payments_all
  WHERE
    action = 'payment'
  GROUP BY
    user_id,
    factory
)
  AS user_summary
    ON  payments_all.user_id    = user_summary.user_id
    ON  payments_all.factory    = user_summary.factory
    AND payments_all.created_at = user_summary.first_created_at
WHERE
     (payments_all.factory    IN ('0_factory', '1_factory', '2_factory'))
 AND (payments_all.created_at >= toDate('2019-01-01 00:00:00'))
 AND (payments_all.created_at <  toDate('2021-10-01 00:00:00'))
MatBailie
  • 83,401
  • 18
  • 103
  • 137