The following problem occurred in our project, which we cannot solve. We have a huge data of our logs, and we go to ClickHouse from MongoDB.
Our table is created like this:
CREATE TABLE IF NOT EXISTS logs ON CLUSTER default (
raw String,
ts DateTime64(6) MATERIALIZED toDateTime64(JSONExtractString(raw, 'date_time'), 6),
device_id String MATERIALIZED JSONExtractString(raw, 'device_id'),
level Int8 MATERIALIZED JSONExtractInt(raw, 'level'),
context String MATERIALIZED JSONExtractString(raw, 'context'),
event String MATERIALIZED JSONExtractString(raw, 'event'),
event_code String MATERIALIZED JSONExtractInt(raw, 'event_code'),
data String MATERIALIZED JSONExtractRaw(raw, 'data'),
date Date DEFAULT toDate(ts),
week Date DEFAULT toMonday(ts)
)
ENGINE ReplicatedReplacingMergeTree()
ORDER BY (device_id, ts)
PARTITION BY week
and I'm running a query like so
SELECT device_id,toDateTime(ts),context,level,event,data
FROM logs
WHERE device_id = 'some_uuid'
ORDER BY ts DESC
LIMIT 10
OFFSET 0;
this is the result 10 rows in set. Elapsed: 6.23 sec.
And second without order, limit and offset:
SELECT device_id,toDateTime(ts),context,level,event,data
FROM logs
WHERE device_id = 'some_uuid'
this is the result Elapsed: 7.994 sec. for each 500 rows of 130000+
Is too slow.
Seems that CH process all the rows in the table. What is wrong and what need to improve the speed of CH?
The same implementation on MongoDB takes 200-500ms max