0

The table

CREATE TABLE events
(
    site_id UInt64,
    name    String
    -- other columns
)
ENGINE = CollapsingMergeTree(sign_flag)
PARTITION BY site_id
ORDER BY (name)
SETTINGS index_granularity = 8192;

The query

SELECT 'wtf',
       *
FROM events
WHERE site_id = 1 AND
      name = 'some_name'
LIMIT 100000;

The log

SELECT formatReadableSize(read_bytes)    AS read_bytes,
       formatReadableSize(memory_usage)  AS memory_usage,
       formatReadableQuantity(read_rows) AS read_rows,
       query_duration_ms / 1000 AS query_duration_sec,
       query
FROM system.query_log
WHERE query LIKE '%wtf%'
ORDER BY
    event_time DESC
LIMIT 100;
+------------+--------------+--------------+--------------------+
| read_bytes | memory_usage | read_rows    | query_duration_sec |
+------------+--------------+--------------+--------------------+
| 578.41 MiB | 131.95 MiB   | 1.01 million | 10.773             |
+------------+--------------+--------------+--------------------+

I think there are very large numbers in the log.

How to optimize it or I miss something about server config ?

cetver
  • 11,279
  • 5
  • 36
  • 56
  • 2
    Consider defining another primary key - for this query *ORDER BY (name, site_id)*. Choosing PK is a pretty important part of the design, to choose right one need to observe all picture of use cases (see https://clickhouse.tech/docs/en/engines/table-engines/mergetree-family/mergetree/#selecting-the-primary-key)(+ https://stackoverflow.com/a/62557177/303298). – vladimir Feb 05 '21 at 19:41
  • 1
    If you query `* ` from a table with 800 columns CH can eat 10GB RAM to read 1 row. Because of buffers for each column. CH creates >2MB buffers per column. – Denny Crane Feb 05 '21 at 19:46
  • @vladimir After changing PK `read_rows = 106.50 thousand | query_duration_sec = 1.175`. Please move your comment to answer, I'll accept it. – cetver Feb 09 '21 at 09:48

1 Answers1

1

Consider defining another primary key - for this query ORDER BY (name, site_id).

Choosing PK is a pretty important part of the design, to choose right one need to observe all picture of use cases. See for more details:

vladimir
  • 13,428
  • 2
  • 44
  • 70