What do I have?

A producer who publishes a stream like this:

- START_JOB 1
- do_task 1_1
- do_task 1_2
[...]
- do_task 1_X
- END_JOB 1

- START_JOB 2
- do_task 2_1
- do_task 2_2
- END_JOB 2

- START_JOB 3
- do_task 3_1
- do_task 3_2
[...]
- do_task 3_X
- END_JOB 3

Or, in other words, sequences of:

  • START_JOB <job_nr>
  • an arbitrary number of do_task <job_nr>_<task_nr> messages
  • END_JOB <job_nr>
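To make that contract concrete, the producer's output can be sketched as a small Python generator (a toy model; `produce_job` and `produce_stream` are made-up names, not any real API):

```python
import random

def produce_job(job_nr, task_count):
    """Yield the message sequence for one job, following the
    START_JOB / do_task / END_JOB protocol described above."""
    yield f"START_JOB {job_nr}"
    for task_nr in range(1, task_count + 1):
        yield f"do_task {job_nr}_{task_nr}"
    yield f"END_JOB {job_nr}"

def produce_stream(job_count, max_tasks=5):
    """Yield the full stream: one job after another, each with
    an arbitrary ("random") number of tasks."""
    for job_nr in range(1, job_count + 1):
        yield from produce_job(job_nr, random.randint(1, max_tasks))
```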

What do I want?

A consumer that, after a crash/restart, always resumes at the most recent START_JOB whose matching END_JOB it has not yet acknowledged, and receives all messages in the correct order from there on.

So if the consumer crashes while handling do_task 12_34, then after a restart it should start with START_JOB 12 and then receive all messages from there on in chronological order.
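This resume rule can be pinned down with a few lines of plain Python (no messaging library involved; the log and the crash point below are invented for illustration): find the last START_JOB at or before the crash position and replay everything from there.

```python
def resume_index(log, crash_index):
    """Index at which a restarted consumer should resume: the
    START_JOB of the job that was in progress at the crash."""
    start = 0
    for i in range(crash_index + 1):
        if log[i].startswith("START_JOB"):
            start = i
    return start

log = [
    "START_JOB 11", "do_task 11_1", "END_JOB 11",
    "START_JOB 12", "do_task 12_33", "do_task 12_34", "END_JOB 12",
]

# Crash while handling do_task 12_34 -> restart at START_JOB 12,
# then replay from there in the original order.
replay = log[resume_index(log, log.index("do_task 12_34")):]
```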

What doesn't work?

Even after trying different combinations of:

  • push and pull consumers
  • using "AckPolicy All" and sending an ack only on END_JOB
  • tuning "AckWait" and adding a sleep of "AckWait" + 2 s before starting the consumer

I sometimes still get the wrong message first after restart.
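For reference: the vocabulary above (push/pull consumers, "AckPolicy All", "AckWait") sounds like NATS JetStream. If so, the combination described would correspond roughly to a consumer configuration like this (field names as in the JetStream consumer API; the durable name and the duration are placeholders, and `ack_wait` is given in nanoseconds):

```json
{
  "durable_name": "job-consumer",
  "ack_policy": "all",
  "ack_wait": 30000000000,
  "deliver_policy": "all"
}
```

With `"ack_policy": "all"`, acknowledging the END_JOB message implicitly acknowledges everything delivered before it on that consumer, which matches the "ack only on END_JOB" idea.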

What do I think could be the problem?

I currently think it has something to do with the acknowledgement settings, and with unacknowledged messages of the current "START_JOB" -> "END_JOB" sequence still waiting for their "AckWait" to expire.

My impression is that tweaking these settings has improved the situation, but issues remain depending on how fast my producer is and on how much time passes between START_JOB and END_JOB (and between individual do_task messages).
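That suspicion can be turned into a toy model (plain Python; this makes no claim about the broker's actual redelivery internals, it only models "an unacked message comes back once its AckWait expires"): if the consumer restarts before the timers of the interrupted job have run out, newer traffic overtakes the old messages and the first message after restart is the wrong one.

```python
def first_after_restart(unacked, now, ack_wait, new_messages):
    """Toy model: an unacked message delivered at time t becomes
    eligible for redelivery once t + ack_wait <= now; anything
    still pending is skipped over until its timer expires.
    All timings and message names are invented for illustration."""
    expired = [m for t, m in unacked if t + ack_wait <= now]
    pending = [m for t, m in unacked if t + ack_wait > now]
    order = expired + new_messages + pending
    return order[0] if order else None

# In-flight messages of the interrupted job: (delivered_at, message).
unacked = [(0, "START_JOB 12"), (1, "do_task 12_34")]

# Restart long after AckWait (the "sleep AckWait + 2 s" case):
ok = first_after_restart(unacked, now=50, ack_wait=30,
                         new_messages=["do_task 12_35"])

# Restart too early, timers not yet expired: wrong message first.
bad = first_after_restart(unacked, now=20, ack_wait=30,
                          new_messages=["do_task 12_35"])
```

In this model the sleep of "AckWait" + 2 s helps but is racy: it only produces the right first message when every in-flight timer has actually expired before the consumer comes back.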

But maybe I'm looking at the wrong thing here, and the solution for my problem is something entirely different.
