4

I'm working on an application that is based on a producer/consumer model. In my case there will be one producer, which generates several million (non-trivial) tasks, and a configurable number of consumers.

Communication between producer and consumers is basically Queue-based. I'm worried however about memory-consumption: it's quite conceivable that the number of tasks will exceed the available memory for the JVM. So I'd like to have a Queue implementation that only keeps the "top-X" number of queue items in memory, and stores the rest on disk. This does not have to be resilient in that it does not need to survive a restart of the program.

I've searched around, but can't find a Queue implementation that seems to be in widespread use (there do seem to be a few of what I'd call proof-of-concept implementations, but I'm worried about future support/continued development of those). I've also looked at external message queue applications, but (1) I don't want to run a second external process and (2) even embedding a message broker inside the same JVM process seems a bit "top heavy" for this requirement.

Does anybody know of any well-supported future-proof library out there that provides this functionality?

Rgds

Maarten Boekhold
  • Did you ever find a reliable solution for your problem? I am looking to solve an identical situation. Thanks! – joe May 02 '16 at 19:25

2 Answers

1

Well, JMS seems like the obvious solution. I don't think you'll find something solid to solve this problem since JMS solves it and is a standard solution.

Note, however, that Java also has bounded queues (for example `ArrayBlockingQueue`, or `LinkedBlockingQueue` constructed with a capacity) to solve this problem: size the queue so that it cannot cause an OOME even when it is full. Producers trying to put a message into the full queue will then block until one of the consumers removes a task.
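As a rough illustration of the blocking behaviour described above, here is a minimal sketch with one producer and several consumers sharing an `ArrayBlockingQueue`. The class name `BoundedQueueSketch`, the `run` method, the `POISON` stop marker, and the use of plain integers as "tasks" are all assumptions made for the example, not part of the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class BoundedQueueSketch {
    // Hypothetical sentinel value that tells a consumer to stop.
    private static final Integer POISON = -1;

    /** Produces task ids 1..taskCount and returns the sum of all ids the consumers processed. */
    static long run(int taskCount, int capacity, int consumers) {
        // At most `capacity` tasks are ever held in memory at once.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        AtomicLong sum = new AtomicLong();
        List<Thread> workers = new ArrayList<>();

        for (int i = 0; i < consumers; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        Integer task = queue.take(); // blocks while the queue is empty
                        if (task.equals(POISON)) {
                            break;
                        }
                        sum.addAndGet(task); // stand-in for real task processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        try {
            for (int i = 1; i <= taskCount; i++) {
                queue.put(i); // blocks while the queue is full
            }
            for (int i = 0; i < consumers; i++) {
                queue.put(POISON); // one stop marker per consumer
            }
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while producing", e);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000, 1_000, 4));
    }
}
```

The capacity (1,000 here) is the tuning knob: it only needs to be large enough to keep the consumers busy, and memory use stays bounded however many tasks the producer generates in total.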

JB Nizet
  • True, I have considered using a BoundedQueue (and still haven't ruled it out). It just sounds so, well, inefficient to make the producer block. – Maarten Boekhold Jan 01 '12 at 12:20
  • Btw. I do disagree with the assumption that since JMS exists and solves this that there is no room for another solution/implementation. JMS is seriously way overkill for something like this. For example, this "memory-restricted queue" might be a requirement in an embedded system, where there just is no space/resources for a JMS message broker. – Maarten Boekhold Jan 01 '12 at 12:23
  • If producers go too fast for the consumers, making the producers block will allow consumers to have more CPU time (or bandwidth) allocated to them, so the overall time spent producing and consuming a given number of tasks could be the same as with a disk-backed queue (or even lower, since you avoid the need to read from and write to the disk). – JB Nizet Jan 01 '12 at 12:24
  • To me, if the queue fills up so quickly, it's probably that the ratio between the number of producers and the number of consumers is not appropriate. The queue is useful to have some flexibility, but its goal is to regulate the pace, not just to store tasks. – JB Nizet Jan 01 '12 at 12:27
  • True, letting the producer(s) block might be a better/faster solution for my problem than using a queue that overflows to disk. We'll never know for sure without testing however ;) – Maarten Boekhold Jan 01 '12 at 12:33
0

Having the producer block when there are more than enough tasks waiting to be consumed is usually more efficient than letting the queue grow and consume more memory. For example, if your queue fits in the cache, it can be several times faster than a queue which doesn't.

Peter Lawrey