I need to reload the data for the last day in ClickHouse. My idea is to delete the rows for the last day with an ALTER TABLE ... DELETE statement and then insert the updated rows.

I found this in the documentation: "Mutations are also partially ordered with INSERT INTO queries: data that was inserted into the table before the mutation was submitted will be mutated and data that was inserted after that will not be mutated." But I don't understand this sentence. Should I wait for the asynchronous delete to finish before I start my inserts, or is it fine to start inserting right away? I'd like to know the answer for both cases, with replication and without.

vladimir
Stanislav
  • If the DELETE can cover the inserted data, you should definitely wait for the DELETE mutations to finish before making INSERTs, to avoid unexpectedly removing the data you have just inserted. If I were you, I would wait for the delete to complete in any case. – vladimir Dec 30 '20 at 18:49
  • You don't need to wait. When a mutation is submitted, the CH server creates a list of parts that should be mutated. New inserts are not in this list. – Denny Crane Dec 30 '20 at 21:35

1 Answer

You don't need to wait.

When a mutation is submitted, the CH server creates and saves a list of parts that should be mutated. You can see this list in the system.mutations table, in the block_numbers.partition_id column. After that, the ALTER returns control to the client.

New inserts (parts) are not in this list because they are not created yet.

The mutation itself starts later, asynchronously, and processes the parts from the stored list.
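
As a rough sketch of the sequence described above, assuming a hypothetical table `events` with a Date column `event_date` and a hypothetical staging table `events_staging` holding the corrected rows (none of these names come from the question), and reading "the last day" as yesterday:

```sql
-- Hypothetical names: events, events_staging, event_date.

-- 1. Submit the mutation. The server records which existing data it covers
--    and returns control to the client; the delete runs in the background.
ALTER TABLE events DELETE WHERE event_date = yesterday();

-- 2. Insert the corrected rows right away. They land in new parts created
--    after the mutation was submitted, so the mutation does not touch them.
INSERT INTO events
SELECT * FROM events_staging
WHERE event_date = yesterday();

-- 3. Optionally inspect the mutation and its progress.
SELECT
    mutation_id,
    command,
    create_time,
    block_numbers.partition_id,
    block_numbers.number,
    parts_to_do,
    is_done
FROM system.mutations
WHERE table = 'events'
ORDER BY create_time DESC;
```

As far as I understand the documentation for system.mutations, the block_numbers.partition_id array is filled in for replicated tables only, while a non-replicated table records a single block number in block_numbers.number, so what the last query shows may differ between the two setups.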

Denny Crane