I have a stream of documents that go through multiple processing steps, which run in parallel. After each step completes, a message is sent to the stage completion topic. After all the steps are done, the tracker sends a message to the processing complete topic with the document Id.
I am using Kafka Streams (with Spring Cloud Stream on top) in the tracker to implement the above functionality.
The following is the sample code.
@StreamListener
@SendTo("processingComplete")
public KStream<String, String> onCompletion(
        @Input("stageCompletion") KStream<String, String> stageCompletionStream) {
    return stageCompletionStream
            .filter(this::checkValidity)
            .groupByKey(Serialized.with(Serdes.String(), Serdes.String()))
            .reduce(this::aggregateStageCompletion,
                    Materialized.as("stage_completion_store"))
            .toStream()
            .filter((ignored, message) -> checkCompletion(message))
            .map(this::publishCompletion);
}
After I publish the completion message, I need to clean up the state store, stage_completion_store (which happens to be RocksDB by default), for that document Id.
The suggested approach is to insert a tombstone message. To do so, I have additionally implemented another stream that reads the processing complete topic and merges it with the stage completion stream.
The following is the code using this approach.
@StreamListener
@SendTo("processingComplete")
public KStream<String, String> onCompletion(
        @Input("stageCompletion") KStream<String, String> stageCompletionStream,
        @Input("processingCompleteFeed") KStream<String, String> processingCompletionStream) {
    return processingCompletionStream.merge(stageCompletionStream)
            .filter(this::checkValidity)
            .groupByKey(Serialized.with(Serdes.String(), Serdes.String()))
            .reduce(this::aggregateStageCompletion,
                    Materialized.as("stage_completion_store"))
            .toStream()
            .filter((ignored, message) -> checkCompletion(message))
            .map(this::publishCompletion);
}
The aggregateStageCompletion method inserts the tombstone (returns null) when the message is a processing completion message.
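To make the tombstone behaviour concrete, here is a minimal sketch of what such a reducer could look like. Everything about the message format is an assumption made for illustration: the comma-separated list of stage names, the PROCESSING_COMPLETE marker value, and the three stage names are hypothetical and not the real payloads. The sketch is plain Java with no Kafka dependency, so the two methods can be exercised on their own.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class StageCompletionReducer {

    // Hypothetical marker value carried by records read back from the processing complete topic.
    static final String COMPLETE_MARKER = "PROCESSING_COMPLETE";

    // Hypothetical set of stages a document must pass through.
    static final Set<String> REQUIRED_STAGES = Set.of("ocr", "classify", "index");

    // Reducer: merges the incoming stage name(s) into the stored aggregate.
    // Returning null is what acts as the tombstone for the key.
    static String aggregateStageCompletion(String aggregate, String incoming) {
        if (COMPLETE_MARKER.equals(incoming)) {
            return null; // document finished: drop its entry from the store
        }
        Set<String> stages = new TreeSet<>(Arrays.asList(aggregate.split(",")));
        stages.addAll(Arrays.asList(incoming.split(",")));
        return String.join(",", stages); // sorted, de-duplicated stage list
    }

    // True once every required stage is present; null-safe because the
    // tombstone surfaces downstream of toStream() as a null value.
    static boolean checkCompletion(String aggregate) {
        return aggregate != null
                && new TreeSet<>(Arrays.asList(aggregate.split(","))).containsAll(REQUIRED_STAGES);
    }
}
```

Note that with this shape the filter after toStream() has to tolerate a null message, since the deleted entry is forwarded downstream as a null value.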
Is this a good way to do it, reading a stream just to write a tombstone? Or is there a better approach to achieve the same result?