0

I am trying to execute the following query with Clickhouse which joins 4 tables:

select
           s.branch_id AS branch_id,
           pa.sharer_id as sharer_id,
           si.commission_type_id as commission_type_id,
           bp.product_id as product_id,
           bc.client_id as client_id,
           SUM(si.amount_exclusive) AS production,
           SUM(si.premium)          AS premuim,
           SUM(si.fees)             AS fee,
           SUM(si.commission)       AS commission,
           'Y' as rp_indicator,
            date_trunc('month',s.statement_date) as statement_date
      FROM statement s
             JOIN statement_item si ON si.statement_id = s.id
             LEFT JOIN product_agreement pa ON pa.id = si.product_agreement_id
             JOIN branch_product bp ON bp.id = pa.branch_product_id
             join branch_client bc on bc.id = bp.branch_client_id
      GROUP BY
           date_trunc('month',s.statement_date) ,
           s.branch_id,
           pa.sharer_id,
           si.commission_type_id,
           bp.product_id,
           bc.client_id

However, that doesn't work due to being out of memory:

Code: 241. DB::Exception: Memory limit (total) exceeded: would use 9.44 GiB (attempt to allocate chunk of 67108864 bytes), maximum: 8.61 GiB. OvercommitTracker decision: Query was selected to stop by OvercommitTracker.: While executing JoiningTransform. (MEMORY_LIMIT_EXCEEDED) (version 23.3.1.2823 (official build) Error when joining MergeTree() tables

The largest table contains around 120million rows and the total size of these tables should be no more than 250mil rows.

Is there any way to improve this query otherwise execute it without running out of memory? Can clickhouse be set to spill to the disk when ram reaches its limit (I know this will be slower but at this point we just need the tables to join)?

  • Check these ones - https://stackoverflow.com/questions/64593901/clickhouse-join-with-condition/64597112#64597112, https://stackoverflow.com/questions/65316905/clickhouse-dbexception-memory-limit-for-query-exceeded/65317795#65317795. – vladimir Apr 21 '23 at 16:52

1 Answers1

0

See partial_merge algorithm that will splill to disk and also see https://clickhouse.com/blog/clickhouse-fully-supports-joins-part2 for more recent grace hash join. Also keep small table on right side of join.

gingerwizard
  • 184
  • 5
  • While links may answer the question [it is recommended](//meta.stackexchange.com/q/8259) to include the essential parts of the answer here, and provide the link for reference. – dan1st Apr 24 '23 at 11:09