I have a text processing application in Java, which reads a file chunk by chunk (~100000 lines) and processes each chunk in a separate thread.

It works well, but there is one issue: reading lines is much faster than processing them, so the program ends up with a queue of Runnables waiting for their turn. That costs memory which I would like to save.

I would like the program to behave this way:

  • read 16 chunks and submit them to 8 runnables;
  • whenever the number of unprocessed chunks falls below 12, read 4 more chunks of text.

That will keep the Runnables busy while leaving memory free for processing (instead of spending it on stored chunks).

How do I do this in Java? In pseudocode, I want this:

loop {

  chunk = readChunkOfData();

  counter.inc();    

  processAsync(chunk);

  if (counter.isBiggerThan(16)) {
    counter.sleepWhileCounterIsBiggerThan(12);
  }
}

...

worker {
  // do the job

  counter.dec();
}
Denis Kulagin

1 Answer

As Marko Topolnik commented, using bounded (blocking) queues can solve your problem elegantly.

You don't need a counter, since the queue knows its own limit. Your pseudocode would end up looking something like this:

loop {
    chunk = readChunkOfData();
    queue.put(chunk);
}

worker {
    chunk = queue.take();
    process(chunk);
}

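This assumes that queue is, for example, new ArrayBlockingQueue<>(16) and is shared by all the workers: put blocks the reader while 16 chunks are already waiting, and take blocks a worker while the queue is empty. You can also use drainTo(Collection<? super E> c, int maxElements) in the workers to take multiple chunks at once, as an additional work buffer on the worker side (note that drainTo does not block when the queue is empty), but that probably wouldn't make much of a difference.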
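For illustration, here is a minimal runnable sketch of that idea. The class name BoundedChunkPipeline, the END_OF_INPUT marker used to stop the workers, and the line-counting "processing" step are assumptions made for the example, not part of your code. Chunks are modelled as List<String>, and the capacity of 16 and the 8 workers are taken from your question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedChunkPipeline {
    private static final int QUEUE_CAPACITY = 16; // at most 16 chunks wait in memory
    private static final int WORKER_COUNT = 8;
    // Hypothetical end-of-input marker; compared by identity, never processed.
    private static final List<String> END_OF_INPUT = new ArrayList<>();

    /** Feeds all chunks through a bounded queue and returns the number of lines processed. */
    static int run(Iterable<List<String>> chunks) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(QUEUE_CAPACITY);
        AtomicInteger processedLines = new AtomicInteger();

        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < WORKER_COUNT; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        List<String> chunk = queue.take(); // blocks while the queue is empty
                        if (chunk == END_OF_INPUT) {
                            return; // no more input for this worker
                        }
                        // Placeholder for the real processing: just count the lines.
                        processedLines.addAndGet(chunk.size());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        try {
            for (List<String> chunk : chunks) {
                queue.put(chunk); // blocks the reader while the queue is full
            }
            for (int i = 0; i < WORKER_COUNT; i++) {
                queue.put(END_OF_INPUT); // one marker per worker
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            workers.forEach(Thread::interrupt); // do not leave workers blocked in take()
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while feeding chunks", e);
        }
        return processedLines.get();
    }

    public static void main(String[] args) {
        // Stand-in for reading the file: 100 small chunks of 2 lines each.
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            chunks.add(Arrays.asList("line " + i + "a", "line " + i + "b"));
        }
        System.out.println("Lines processed: " + run(chunks));
    }
}
```

In a real program the reader loop would call your own chunk-reading method instead of iterating over a prepared list. The important part is that put does the throttling for you, so no counter or sleep is needed.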

Kayaman