I would like to process tuples in batches, and I am thinking of using the Trident API for this. However, I do not perform any batch-level operations; every tuple is processed individually. All I need is exactly-once semantics, so that every tuple is processed only once, and that is my only reason for using Trident.
I want to record which tuples have already been processed so that, when a batch is replayed, those tuples are not processed again.
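To illustrate the replay check I have in mind, here is a minimal plain-Java sketch with no Storm dependency. It assumes each batch carries a transaction id that stays the same when the batch is replayed, and that batches are committed in increasing id order, which is how I understand Trident's transactional model. The names `ReplayDedupSketch`, `BatchLedger`, and `applyOnce` are hypothetical, and the in-memory field only stands in for a durable store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ReplayDedupSketch {

    /** Hypothetical ledger that remembers the last committed transaction id. */
    static final class BatchLedger {
        // Assumption: a real system would keep this in a durable store and
        // update it atomically with the side effects of the batch.
        private long lastCommittedTxid = -1L;

        /**
         * Runs perTupleWork on every tuple of the batch unless the batch
         * was already committed. Returns true if the batch was applied,
         * false if it was skipped as a replay.
         */
        <T> boolean applyOnce(long txid, List<T> batch, Consumer<T> perTupleWork) {
            if (txid <= lastCommittedTxid) {
                return false; // replayed batch: skip every tuple in it
            }
            for (T tuple : batch) {
                perTupleWork.accept(tuple); // each tuple is handled individually
            }
            lastCommittedTxid = txid;
            return true;
        }
    }

    public static void main(String[] args) {
        BatchLedger ledger = new BatchLedger();
        List<String> processed = new ArrayList<>();

        ledger.applyOnce(1L, List.of("a", "b"), processed::add); // applied
        ledger.applyOnce(1L, List.of("a", "b"), processed::add); // replay, skipped
        ledger.applyOnce(2L, List.of("c"), processed::add);      // applied

        System.out.println(processed); // prints [a, b, c]
    }
}
```

The check works at batch granularity rather than per tuple: the batch is only a unit of bookkeeping, while the work itself is still done tuple by tuple.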
Trident provides a persistentAggregate() method, but it expects an aggregation operation, and I have no aggregation to perform over a set of tuples because every tuple is processed individually.
The functions applied to each tuple are too small to be worth executing one tuple at a time, so I want to process tuples in batches to save computing resources and time.
How can I write a topology that consumes tuples in batches but does not perform any batch operations (such as a word count)?