
We have a product that requires writing 1,000+ records to a database table once they are approved by a staff member in our company. The traditional way to write this many records at once would be to loop over single inserts, or to run a MySQL bulk insert directly against the table.

Along with this, I also have a couple of tables that are checked by a cron job, which updates another table with 2,000 records at once.

I would love to know whether I should proceed with a MySQL bulk insert (which has a performance impact) or use an event-processing tool like Kafka.
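Before reaching for Kafka, it is cheap to measure the bulk insert itself. Below is a minimal sketch of such a measurement in Python, using an in-memory SQLite database as a stand-in for MySQL; the table and column names are invented for illustration, not taken from the actual schema:

```python
# Hypothetical sketch: time a 1,000-row batch insert.
# SQLite stands in for MySQL; `records`/`payload` are made-up names.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(i, f"record-{i}") for i in range(1000)]

start = time.perf_counter()
# Parameterized batch insert: the statement is prepared once and run per
# row. The MySQL analogue is a multi-row INSERT ... VALUES (...), (...).
conn.executemany("INSERT INTO records (id, payload) VALUES (?, ?)", rows)
conn.commit()
elapsed = time.perf_counter() - start

count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
print(count, f"{elapsed:.4f}s")
```

On any reasonable hardware this finishes in a small fraction of a second, which is the kind of number to compare against the cost of adding new infrastructure.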

Dharman
Pavan Welihinda
  • 1
    A thousand records shouldn't take all that long. Have you *measured* the performance impact? – ceejayoz May 23 '17 at 21:55
  • I haven't run a performance comparison because we don't have enough resources. Now the question is: after writing all the data to one table, reads will also slow down as the table grows over time. I'm wondering whether I should stick with MySQL or go for a NoSQL database, since this table will be used for reading data. What do you think? – Pavan Welihinda May 24 '17 at 02:17
  • 1
    I think you're jumping to unnecessary optimizations. MySQL can handle billions of rows, if properly indexed and on the right hardware. I'm a bit baffled by the idea that you don't have the resources to run the bulk insert once or twice to see how long it takes - surely that takes fewer resources than adopting NoSQL or adding Kafka?! – ceejayoz May 24 '17 at 02:20
  • That is correct. I meant we don't have enough resources to do a full performance comparison of the different ways of writing the data, including things like Kafka, because whatever I build has to hold up for a couple of years. I will check the time for the bulk insert either way. – Pavan Welihinda May 24 '17 at 02:26

1 Answer


I sense a trick that could make this even faster than LOAD DATA, which is probably the fastest conventional approach...

"once approved by ..." -- Does this mean that you have the data sitting around somewhere, and you want to "push a button" to get them added to a particular table? If so,...

Preload them into another table. Then have this query ready to run:

INSERT INTO real_table
    SELECT * FROM pending_table;

But... Are the columns lined up correctly? Can there be duplicate keys? If so, do you need to update something? See IODKU (INSERT ... ON DUPLICATE KEY UPDATE). Etc.
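The staging-table copy plus the duplicate-key case can be sketched as follows. This is a hedged illustration using SQLite as a stand-in for MySQL: the `pending_table`/`real_table` names come from the answer, the `amount` column is invented, and SQLite's `ON CONFLICT ... DO UPDATE` plays the role of MySQL's IODKU (SQLite's parser requires the `WHERE true` when an upsert clause follows a SELECT):

```python
# Hypothetical sketch: copy a preloaded staging table into the real
# table in one statement, updating rows whose keys already exist.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pending_table (id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE real_table    (id INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO pending_table VALUES (1, 10), (2, 20);
INSERT INTO real_table    VALUES (2, 99);  -- pre-existing duplicate key
""")

# SQLite analogue of MySQL's INSERT ... ON DUPLICATE KEY UPDATE.
conn.execute("""
INSERT INTO real_table (id, amount)
    SELECT id, amount FROM pending_table WHERE true
    ON CONFLICT(id) DO UPDATE SET amount = excluded.amount
""")
conn.commit()

rows = conn.execute("SELECT id, amount FROM real_table ORDER BY id").fetchall()
print(rows)  # [(1, 10), (2, 20)]
```

Row 1 is inserted fresh; row 2 collides with the pre-existing key and is updated in place rather than raising a duplicate-key error.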

I recommend you test the process before the day comes. It could be embarrassing if it fails.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Correct. The data will be sitting in a table. Weekly, one of our employees will approve that data; it should then be pushed to this new table. The new table isn't an exact replica of the initial table - it will be more like an auditing table, with fewer columns and different column names. – Pavan Welihinda May 24 '17 at 02:20
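Since the audit table has fewer, differently named columns, `SELECT *` won't line up; an explicit column list on both sides handles the rename-and-drop mapping. A small sketch with invented names (`customer`, `internal_notes`, `record_id`, `total` are all assumptions), again using SQLite as a stand-in for MySQL:

```python
# Hypothetical sketch: copy into an audit table with fewer columns
# and different column names, via explicit column lists.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pending_table (id INTEGER PRIMARY KEY, customer TEXT,
                            amount INTEGER, internal_notes TEXT);
CREATE TABLE audit_table   (record_id INTEGER PRIMARY KEY, total INTEGER);
INSERT INTO pending_table VALUES (1, 'alice', 10, 'n/a');
""")

# Naming both column lists makes the mapping explicit: id -> record_id,
# amount -> total; customer and internal_notes are simply not copied.
conn.execute("""
INSERT INTO audit_table (record_id, total)
    SELECT id, amount FROM pending_table
""")
conn.commit()

audit_rows = conn.execute("SELECT * FROM audit_table").fetchall()
print(audit_rows)  # [(1, 10)]
```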