I have a custom processor that buffers events in a plain java.util.List inside process(). This buffer is not a state store.
Every 30 seconds of WALL_CLOCK_TIME, punctuate() sorts the list and flushes it to the sink. Assume a single-partition source and a single-partition sink. The EOS (exactly-once) processing guarantee is required.
I know that at any given time either process() or punctuate() is executing, never both.
I am concerned that this buffer is not backed by a changelog topic. Ideally, I believe it should have been a state store in order to support EOS.
But there is an argument that setting commit.interval.ms to more than 30 seconds, say 40 seconds, ensures that the events in the buffer are never lost. Also, since we are using WALL_CLOCK_TIME, punctuate() will always be called every 30 seconds regardless of whether we have events or not.
Is this a valid argument? In which cases would the events in the buffer be lost forever?
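
For concreteness, the settings behind this argument would look roughly like the sketch below. It uses plain java.util.Properties with the string keys processing.guarantee and commit.interval.ms. The class name StreamsSettingsSketch is made up for illustration, and required settings such as application.id and bootstrap.servers are left out.

```java
import java.util.Properties;

// Hypothetical holder class, for illustration only.
public class StreamsSettingsSketch {

    static Properties settings() {
        Properties props = new Properties();
        // Exactly-once processing; newer Kafka Streams releases use "exactly_once_v2" instead.
        props.put("processing.guarantee", "exactly_once");
        // The proposed 40-second commit interval, expressed in milliseconds.
        props.put("commit.interval.ms", "40000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(settings());
    }
}
```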
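
The processor code:

    @Override
    public void init(ProcessorContext processorContext) {
        super.init(processorContext);
        // Plain in-memory buffer; it is not a state store and has no changelog topic.
        this.buffer = new ArrayList<>();
        // Flush the buffer every 30 seconds of wall-clock time.
        context().schedule(Duration.ofSeconds(30L), PunctuationType.WALL_CLOCK_TIME, this::flush);
    }

    void flush(long timestamp) {
        LOG.info("Punctuator invoked.....");
        // Forward the buffered events downstream, sorted by id.
        buffer.stream()
              .sorted(Comparator.comparing(o -> o.getId()))
              .forEach(i -> context().forward(i.getId(), i));
        // Empty the buffer so the same events are not forwarded again on the next run.
        buffer.clear();
    }

    @Override
    public void process(String key, Customer value) {
        LOG.info("Processing {}", key);
        // Only buffer here; nothing is forwarded until the next punctuation.
        buffer.add(value);
    }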
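
One way to reason about the commit.interval.ms argument is a toy timeline, sketched below in plain Java with no Kafka involved. Everything in it is an assumption for illustration: the class and method names are hypothetical, one record arrives per second with its offset equal to its arrival second, a commit covers every record already passed to process(), and under EOS forwarded output only becomes visible when the commit that follows it succeeds. It is a simplified model, not how Kafka Streams is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only (hypothetical names): no Kafka classes are used.
public class BufferLossTimeline {

    /**
     * Simulates seconds 1 .. crashAtSec-1 and returns the offsets that are
     * covered by a commit but whose output was never committed.
     */
    static List<Long> lostOffsets(long punctuateEverySec, long commitEverySec, long crashAtSec) {
        List<Long> buffer = new ArrayList<>();        // in-memory buffer, gone after a crash
        List<Long> pendingOutput = new ArrayList<>(); // forwarded, but not yet committed
        Set<Long> committedOutput = new HashSet<>();  // output that survived a commit
        long committedOffset = 0;                     // highest source offset covered by a commit

        for (long t = 1; t < crashAtSec; t++) {
            buffer.add(t);                            // process(): record with offset t is buffered
            if (t % punctuateEverySec == 0) {         // punctuate(): forward everything, then clear
                pendingOutput.addAll(buffer);
                buffer.clear();
            }
            if (t % commitEverySec == 0) {            // commit: offsets and pending output together
                committedOutput.addAll(pendingOutput);
                pendingOutput.clear();
                committedOffset = t;
            }
        }

        // After the crash, the buffer and the uncommitted output are gone, and
        // processing resumes after committedOffset, so earlier records are not re-read.
        List<Long> lost = new ArrayList<>();
        for (long offset = 1; offset <= committedOffset; offset++) {
            if (!committedOutput.contains(offset)) {
                lost.add(offset);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // 30-second punctuation, 40-second commit, crash at second 45.
        System.out.println(lostOffsets(30, 40, 45));
    }
}
```

Under these assumptions, a crash at second 45 reports offsets 31 through 40: they were covered by the commit at second 40 but were still sitting in the in-memory buffer, so they are neither forwarded nor re-read after a restart. A crash at second 35, before any commit, reports nothing, because all records would be re-read from the source.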