What do I have?
A producer that publishes a stream like this:
- START_JOB 1
- do_task 1_1
- do_task 1_2
[...]
- do_task 1_X
- END_JOB 1
- START_JOB 2
- do_task 2_1
- do_task 2_2
- END_JOB 2
- START_JOB 3
- do_task 3_1
- do_task 3_2
[...]
- do_task 3_X
- END_JOB 3
Or, in other words, sequences of:
- START_JOB <job_nr>
- a variable number of do_task <job_nr>_<task_nr>
- END_JOB <job_nr>
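For reference, the pattern can be sketched with a small generator (a pure-Python illustration only; job and task counts are made up):

```python
import random

def job_stream(num_jobs, max_tasks=5, seed=42):
    """Yield messages in the START_JOB / do_task / END_JOB pattern."""
    rng = random.Random(seed)
    for job_nr in range(1, num_jobs + 1):
        yield f"START_JOB {job_nr}"
        # each job carries a variable number of tasks
        for task_nr in range(1, rng.randint(1, max_tasks) + 1):
            yield f"do_task {job_nr}_{task_nr}"
        yield f"END_JOB {job_nr}"

messages = list(job_stream(3))
```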
What do I want?
A consumer that (after a crash/restart) always resumes at the last START_JOB whose matching END_JOB it has not yet acknowledged, and receives all messages in the correct order from there on.
So if the consumer crashes while handling do_task 12_34, then after a restart it should start again with START_JOB 12 and receive all messages from there on in chronological order.
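The desired restart behaviour can be illustrated with a toy replay model (this is not NATS itself, just the semantics I want; all names here are hypothetical): the durable cursor only advances past a whole sequence once END_JOB is acknowledged, so a restart replays from the START_JOB of the unfinished sequence.

```python
class ReplayConsumer:
    """Toy model: the committed cursor only moves past a whole
    START_JOB .. END_JOB sequence when END_JOB is acked."""

    def __init__(self, stream):
        self.stream = stream   # full ordered message log
        self.committed = 0     # index just past the last acked END_JOB
        self.pos = 0           # current read position

    def restart(self):
        # After a crash, resume at the last committed offset,
        # i.e. at the START_JOB of the unfinished sequence.
        self.pos = self.committed

    def next_message(self):
        msg = self.stream[self.pos]
        self.pos += 1
        if msg.startswith("END_JOB"):
            self.committed = self.pos  # ack the whole sequence at once
        return msg

stream = [
    "START_JOB 1", "do_task 1_1", "END_JOB 1",
    "START_JOB 2", "do_task 2_1", "do_task 2_2", "END_JOB 2",
]
consumer = ReplayConsumer(stream)
for _ in range(5):      # "crash" while handling do_task 2_1
    consumer.next_message()
consumer.restart()
# the next delivered message is START_JOB 2, not do_task 2_2
```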
What doesn't work?
Even after trying different combinations of:
- push/pull consumers
- using "AckPolicy All" and only sending an ack on END_JOB
- playing with "AckWait" and adding a sleep of "AckWait" + 2 s before starting the consumer

I sometimes still get the wrong message first after a restart.
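For context, one of the combinations I tried corresponds roughly to a durable JetStream consumer configured like this (field values are illustrative, not my exact setup; ack_wait is in nanoseconds):

```
{
  "durable_name": "job-consumer",
  "ack_policy": "all",
  "ack_wait": 30000000000,
  "deliver_policy": "all",
  "replay_policy": "instant"
}
```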
What do I think could be the problem?
I currently think it has something to do with the acknowledgement settings (and with messages of the current "START_JOB" -> "END_JOB" sequence still waiting to expire).
My impression is that playing with these settings has improved the situation, but issues remain depending on how fast my producer is and how much time passes between START_JOB and END_JOB (and between the individual do_task messages, respectively).
But maybe I'm looking at the wrong thing here, and the solution for my problem is something entirely different.