I have an application using the HSQL database that, on very rare occasions, needs to copy a lot of data (5 million rows or more) into the database. On an i7 this takes about 3 hours, which is perfectly fine.

The issue I have is that on a weaker CPU like an i3 it is not possible to do this copy task at all: CPU usage sits at 100% on all cores and, as a consequence, the whole application freezes.

I'm looking for a way to "throttle" the data insertion process. It's totally okay if the copy takes much longer, as long as it completes and doesn't freeze the application.

I have been looking through the official documentation here: http://hsqldb.org/doc/guide/guide.html but couldn't find what I was looking for.

What would be the best approach to get this task working on weaker CPUs as well?

Markus
  • The common approach is to add a pause to your Java program after every 10 rows inserted. Thread.sleep() perhaps. http://stackoverflow.com/questions/1036754/difference-between-wait-and-sleep – fredt Jan 12 '17 at 12:16
  • I have tried adding a sleep of 5 seconds after every 100 rows, which reduced CPU usage by about 10% on average. I guess this is the only suitable approach then! – Markus Jan 12 '17 at 12:56
  • Actual CPU usage should be a lot less if it is working for 1s or less and then waiting for 5s. Look at the total CPU time reported for the process and check whether it is reported correctly. – fredt Jan 12 '17 at 13:39
  • I have measured it with Windows Process Explorer. The thing is, I have 50 threads (each with its own database connection) and each needs to insert about 100,000 rows at the same time into different tables. As I said, I tell each thread to sleep for 5 seconds after inserting 100 rows, and I see at most a 10% improvement. – Markus Jan 12 '17 at 15:01
  • This is the slow way to populate the database. Use one thread to populate it table by table and it will be much faster overall and use much less CPU. – fredt Jan 12 '17 at 22:03
  • I'm fully aware of this. The problem is that there is no single data source I get my data from. I get my data from 50 different physical industrial machines, and a top requirement is that the update of one machine does not block the update of another machine. I'd love to have only a single thread do the work. In fact, I already implemented such a solution, but it got rejected because the update of one machine blocks the updates of all other machines until it's done :/ Thanks for your input anyway, maybe I can come up with something more clever ;) – Markus Jan 13 '17 at 06:01
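
One way to combine the two ideas from the comments (a single writer, plus a pause after every few rows) might look like the sketch below. It is not taken from the HSQLDB documentation, and `ThrottledWriter`, its constructor parameters and the `sink` callback are all hypothetical names made up for illustration. The assumption is that each machine thread only enqueues rows, while one writer thread drains the queue in small batches and sleeps between batches, so rows from different machines interleave instead of one machine's update blocking the others. The `sink` would presumably wrap a JDBC batch insert (`PreparedStatement.addBatch()` / `executeBatch()`) on a single connection; it is left abstract here so the example runs without a database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical helper: many producers enqueue rows, one thread writes them in paced batches. */
final class ThrottledWriter<T> implements Runnable {
    private final BlockingQueue<T> queue;
    private final int batchSize;
    private final long pauseMillis;
    private final Consumer<List<T>> sink; // assumed to perform the actual database insert
    private volatile boolean closed = false;

    ThrottledWriter(int capacity, int batchSize, long pauseMillis, Consumer<List<T>> sink) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
        this.sink = sink;
    }

    /** Called by the machine threads; blocks only while the bounded queue is full. */
    void submit(T row) throws InterruptedException {
        queue.put(row);
    }

    /** Call once all producers have finished; the writer exits after draining the queue. */
    void close() {
        closed = true;
    }

    @Override
    public void run() {
        try {
            while (true) {
                // Wait briefly for the next row so the loop can notice close().
                T first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) {
                    if (closed && queue.isEmpty()) {
                        return;
                    }
                    continue;
                }
                // Collect up to batchSize rows, from whichever producers queued them.
                List<T> batch = new ArrayList<>(batchSize);
                batch.add(first);
                queue.drainTo(batch, batchSize - 1);
                sink.accept(batch);
                // The throttle: give the CPU back to the rest of the application.
                Thread.sleep(pauseMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and stop
        }
    }
}
```

Batch size, pause length and queue capacity would need tuning on the slow machine. A full queue makes `submit` block, which slows the producers down rather than letting memory grow without bound.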
