Questions tagged [clickhouse]

ClickHouse is an open-source, column-oriented DBMS for real-time analytical reporting, capable of storing and processing petabytes of data.

ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time.

1835 questions
0
votes
0 answers

GROUP BY date and empty data

I have a table hits with columns created and user_id. I want to get hit counts for the last 30 days, grouped by day. The problem is that on some days a user has no traffic, so those days are missing from the report. How to get data for…
Alex
  • 65
  • 1
  • 4
0
votes
0 answers

What data types should I use to read Jaeger logs from Kafka into ClickHouse?

I'm new to ClickHouse. I'm trying to read Jaeger logs from Kafka into a ClickHouse database. The Kafka messages have the following format: { "traceId": "omFv9AGFHOAfWQ+tJcxDZQ==", "spanId": "Lai3jc8v6Pg=", "operationName": "GET", "startTime":…
Timur
  • 31
  • 1
  • 6
0
votes
1 answer

Extract and sum values with subfields inside string using ClickHouse

I have a ClickHouse database with a simple table with two fields (pagePath string, pageviews int). I want to sum visits for each filter + value, to find pageviews by users per filter (the most used filter). Filters are separated by commas. Example…
lino
  • 163
  • 2
  • 10
0
votes
0 answers

How to do small queries efficiently in ClickHouse

In our deployment, there are one thousand shards. The insertions are done via a distributed table with sharding jumpConsistentHash(colX, 1000). When I query for rows with colX=... and turn on send_logs_level='trace', I see the query is sent to all…
Hieu Nguyen
  • 382
  • 2
  • 15
0
votes
2 answers

ClickHouse: modify arrays inside an array

Given the following query: SELECT arrayZip(Groups.Names, Groups.Scores) AS ZipGroups, arrayZip(Symbols.Names, Symbols.Scores) AS ZipSymbols FROM rspamd WHERE (IsBayes = 'spam') AND ((Action = 'no action') OR (Action = 'greylist')) AND (Score…
Christian Rößner
  • 447
  • 1
  • 5
  • 18
0
votes
2 answers

Regular Expression to search substring

Let's say I have a string like "Michael is studying at the Faculty of Economics at the University" and I need to check whether a given string contains the following expression: Facul* of Econom* where the star sign implies that the word can have many…
Baurzhan
  • 207
  • 4
  • 13
0
votes
1 answer

OOM error when setting up cache dictionaries in ClickHouse

Running a CentOS 7.6 VM with 16 GB of RAM, I get the following error: Code: 32. DB::Exception: Attempt to read after eof: while receiving packet from localhost:9000 when querying a cache dictionary with 75 columns and 100000 rows (126 MB). Is this to…
0
votes
0 answers

ClickHouse: find pairs of events by sequential time and specific types

I have an events table in ClickHouse. When a user (identified by user_id) enters or leaves a room (identified by object_id), an electronic lock must be opened with a key card, so for each interaction with the lock there is a record in events…
MihanEntalpo
  • 1,952
  • 2
  • 14
  • 31
0
votes
1 answer

Apache Pulsar ClickHouse Sink - does it have intervals between inserts?

ClickHouse allows high-performance writes, but only if they are done in bulk and at intervals (an interval of at least 1 second between inserts is recommended). In the documentation for the JDBC connector for ClickHouse, a batchSize option exists, but there is…
Andrii Rusanov
  • 4,405
  • 2
  • 34
  • 54
0
votes
1 answer

Reduce resource consumption in ClickHouse

The table CREATE TABLE events ( site_id UInt64, name String -- other columns ) ENGINE = CollapsingMergeTree(sign_flag) PARTITION BY site_id ORDER BY (name) SETTINGS index_granularity = 8192; The query SELECT 'wtf', * FROM…
cetver
  • 11,279
  • 5
  • 36
  • 56
0
votes
0 answers

Can't remove duplicates from a ClickHouse table

We have a replicated, sharded table on the ReplicatedMergeTree engine. On one shard (of 3 in total) the table now holds 484 million rows, about 21 GB. Engine deduplication does not work on that many rows, and optimization also hangs. Maybe someone…
Di_roman
  • 21
  • 2
0
votes
2 answers

Add identifier of first created record to select statement with group_by

I have the following payments table ┌─name───────────────────────────┬─type────────────────────────────┐ │ payment_id │ UInt64 │ │ factory │ String │ │…
dehelden
  • 39
  • 6
0
votes
2 answers

Benchmarking cache dictionary leads to "Unexpected EOF while reading bytes"

I have ClickHouse version 20.8.3.18 and python3 installed on a VM for stress testing cache dictionaries. After a certain number of entries, the query made with clickhouse_driver fails with the error "Unexpected EOF while reading bytes". Is this an error due to…
0
votes
0 answers

How to pass time function data in ClickHouse

I have a MySQL table with fields (from_time, to_time) of the time data type. Below are my table description and some data from that table. I want to migrate the MySQL data to ClickHouse. ClickHouse accepts the time data type field as a…
RohanJ24
  • 21
  • 4
0
votes
1 answer

How to attach multiple partitions in ClickHouse?

I used the query below to attach one partition in ClickHouse: ALTER TABLE a_status_2 ATTACH PARTITION '20210114' FROM a_status_1; How can I attach multiple partitions in ClickHouse?