How to correctly use async insert settings in ClickHouse?

I am following the documentation on the ClickHouse website here: https://clickhouse.com/docs/en/optimize/asynchronous-inserts to use async inserts.

I wanted to apply the async insert settings at the user level. I used the approach mentioned here https://clickhouse.com/docs/en/optimize/asynchronous-inserts#enabling-asynchronous-inserts and verified from the system.settings table that async insert is enabled.
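
For reference, I verified it with a query along these lines (standard system.settings columns):

-- Run from the same session/user that should have async inserts enabled
SELECT name, value, changed
FROM system.settings
WHERE name IN ('async_insert', 'wait_for_async_insert');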

Now, when I insert into tables logged in to the server as this user, using only the console and an INSERT INTO table () VALUES () command, the record is inserted into the table, but the insertion does not appear in the asynchronous_insert_log table.

When can this happen? Which logs can I use to debug this?

Update: When I do INSERT INTO Table SETTINGS async_insert=1, wait_for_async_insert=1 VALUES (), it actually shows up in the asynchronous_insert_log table; otherwise it does not.
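
For reference, these are roughly the two forms I am comparing (table name and values are placeholders):

-- This one shows up in system.asynchronous_insert_log:
INSERT INTO my_table SETTINGS async_insert = 1, wait_for_async_insert = 1 VALUES (1, 'a');

-- This one relies on the user-level setting and, in my case, does not show up:
INSERT INTO my_table VALUES (1, 'a');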

  • Not sure why the records would be deleted. You might have to share your table DDL – Rich Raposa Aug 28 '23 at 14:17
  • Found the issue with the deleted rows; it was my mistake. I had a 3-month TTL on the table rows based on a DateTime field, and I was not passing that field when testing async inserts, so it took the default value of 1970-01-01, and when ClickHouse's merge process ran, it deleted that row. – Shikhar Tyagi Aug 28 '23 at 15:11
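
A minimal sketch of the TTL scenario described in the comment above (hypothetical table and column names):

-- Hypothetical table with a 3-month TTL on a DateTime column
CREATE TABLE events
(
    id UInt64,
    created_at DateTime
)
ENGINE = MergeTree
ORDER BY id
TTL created_at + INTERVAL 3 MONTH;

-- Inserting without created_at leaves it at the DateTime default (1970-01-01 00:00:00),
-- so the row is already past its TTL and is dropped on the next merge
INSERT INTO events (id) VALUES (42);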

1 Answer

This enables async inserts at the user level:

ALTER USER default SETTINGS async_insert = 1

But that doesn't enable the logging. You have to configure the asynchronous_insert_log by adding something like the following in a new file in config.d:

<clickhouse>
    <asynchronous_insert_log>
        <database>system</database>
        <table>asynchronous_insert_log</table>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
        <max_size_rows>1048576</max_size_rows>
        <reserved_size_rows>8192</reserved_size_rows>
        <buffer_size_rows_flush_threshold>524288</buffer_size_rows_flush_threshold>
        <flush_on_crash>false</flush_on_crash>
        <engine>Engine = MergeTree PARTITION BY event_date ORDER BY event_time TTL event_date + INTERVAL 30 day</engine>
    </asynchronous_insert_log>
</clickhouse>

More details in the docs: https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings#server_configuration_parameters-asynchronous_insert_log
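
If the configuration is in place but you don't see recent entries, keep in mind the log table is flushed on an interval; something along these lines forces a flush so you can check right away:

-- Flush the in-memory buffers of the system log tables to disk
SYSTEM FLUSH LOGS;

-- Then look at the most recent async insert entries
SELECT event_time, query
FROM system.asynchronous_insert_log
ORDER BY event_time DESC
LIMIT 10;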

Rich Raposa
  • This configuration is already there in my config.xml file – Shikhar Tyagi Aug 28 '23 at 14:55
  • So are you seeing entries in this log table now? – Rich Raposa Aug 28 '23 at 16:02
  • When I do `INSERT INTO Table SETTINGS async_insert=1, wait_for_async_insert=1 VALUES ()`, it actually shows up in the asynchronous_insert_log table, but when I do `INSERT INTO table () VALUES ()`, it doesn't show up there. – Shikhar Tyagi Aug 28 '23 at 17:19
  • So it's a setting issue - not an async_insert issue. For whatever reason, your `async_insert=1` is not getting applied. Make sure you're sending the query as the `default` user (since that is who you applied the setting to) or send the setting with each insert using the `SETTINGS` clause – Rich Raposa Aug 29 '23 at 20:31
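
A quick way to confirm which user the session is actually running as, and whether the setting is in effect there (assuming console access):

-- If this does not return 'default', the ALTER USER setting above won't apply to this session
SELECT currentUser();

-- Check the effective value in the same session
SELECT value, changed FROM system.settings WHERE name = 'async_insert';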