
I am using a Galera Cluster with 3 nodes and am currently facing the following problem. I want to write more than 500 million records into the database, for example into table Data. These are the steps (sketched in SQL below):

  • Create table NewData with the same schema as Data but without indexes.
  • Write 500 million records into this table (using multiple threads, each writing batches of records).
  • After the load finishes, add the indexes to this table.
  • Rename Data to OldData and rename NewData to Data.
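
For illustration, the steps above might look roughly like this in SQL; the index and column names (idx_payload, payload) are placeholders, not my real schema:

    -- 1. Clone the schema of Data, then drop its secondary indexes
    --    (CREATE TABLE ... LIKE copies them as well).
    CREATE TABLE NewData LIKE Data;
    ALTER TABLE NewData DROP INDEX idx_payload;   -- repeat for each index

    -- 2. Bulk-load; each thread runs batched inserts like this.
    INSERT INTO NewData (id, payload) VALUES (1, 'a'), (2, 'b'), (3, 'c');

    -- 3. Add the indexes back after loading.
    ALTER TABLE NewData ADD INDEX idx_payload (payload);

    -- 4. Swap the tables.
    RENAME TABLE Data TO OldData, NewData TO Data;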

The problem I currently have is that during the indexing phase, other services cannot read or write data. After I increased innodb_buffer_pool_size, the other nodes can read data but still cannot write.

I have configured the write job to run on a different node than the other APIs, but the problem is still the same. I would expect that if one node is under a very high workload, the other nodes should still behave normally. Please tell me why this happens and how to fix it.

Thanks

Trần Kim Dự

1 Answer


I think you missed a step.

  1. (one-time) Create table NewData with the same schema as Data but without indexes.
  2. Insert into NewData.
  3. Create table Empty (again like Data but without any index).
  4. RENAME TABLE NewData TO ToIndex, Empty TO NewData; -- now the ingestion can proceed.
  5. ALTER TABLE ToIndex ADD INDEX ...
  6. RENAME TABLE Data TO Old, ToIndex TO Data;
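
A minimal SQL sketch of one such cycle, using placeholder index and column names (idx_payload, payload):

    -- One-time setup: an unindexed clone of Data.
    CREATE TABLE NewData LIKE Data;
    ALTER TABLE NewData DROP INDEX idx_payload;   -- repeat for each index

    -- Each cycle:
    CREATE TABLE Empty LIKE NewData;              -- still unindexed
    RENAME TABLE NewData TO ToIndex,              -- atomic multi-table swap;
                 Empty TO NewData;                -- ingestion continues into NewData
    ALTER TABLE ToIndex ADD INDEX idx_payload (payload);
    RENAME TABLE Data TO Old, ToIndex TO Data;    -- publish the indexed copy

Note that RENAME TABLE will not overwrite an existing table, so on each subsequent cycle Old has to be dropped (or renamed away) first.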

The point is to have two things going on:

  • Continually writing to the unindexed NewData.
  • Periodically swapping tables around so that the filled table (under a new name) gets indexed, and then replaces the live table (which is always seen as Data).

This is not quite the same situation, but has some similarities: http://mysql.rjweb.org/doc.php/staging_table

Rick James