I am implementing Kafka delayed topic consumption with consumer.pause(<partitions>).
However, the Pub/Sub Lite Kafka shim turns pause into a no-op.
Is there any documentation on how to delay consumption of a Pub/Sub Lite topic by a set duration?
That is, I want to consume all messages from a Pub/Sub Lite topic, but with a synthetic 4-minute lag.
Here is my algorithm with native Kafka:
- call consumer.poll()
- resume all assigned partitions: consumer.resume(consumer.assignment())
- combine previously delayed records with recently polled records
- separate records into
  - records that are old enough to process
  - records still too young to process
- pause partitions for any records that are too young: consumer.pause(<partitions of too young>)
- keep a buffer of too young records to reconsider on the next pass, called delayed
- process records that are old enough
- rinse, repeat
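To make the split step concrete, here is a minimal sketch of it in plain Java, with no Kafka dependency. The names `DelaySplit`, `Rec`, `Result`, and `split` are hypothetical and only for illustration; `Rec` stands in for a consumed record and carries just the partition, offset, and timestamp. It assumes that "old enough" means the record timestamp is at least the delay behind the current time.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DelaySplit {
    // Hypothetical stand-in for a consumed record (not the Kafka ConsumerRecord type).
    record Rec(int partition, long offset, long timestampMs) {}

    // Outcome of one pass: records to process now, records to buffer,
    // and the partitions that still hold too-young records.
    record Result(List<Rec> ready, List<Rec> delayed, Set<Integer> partitionsToPause) {}

    static Result split(List<Rec> previouslyDelayed, List<Rec> polled,
                        long nowMs, long delayMs) {
        // Combine the buffer from the last pass with the freshly polled records.
        List<Rec> all = new ArrayList<>(previouslyDelayed);
        all.addAll(polled);

        List<Rec> ready = new ArrayList<>();
        List<Rec> delayed = new ArrayList<>();
        Set<Integer> partitionsToPause = new LinkedHashSet<>();
        for (Rec r : all) {
            if (nowMs - r.timestampMs() >= delayMs) {
                ready.add(r);                          // old enough to process
            } else {
                delayed.add(r);                        // reconsider on the next pass
                partitionsToPause.add(r.partition());  // candidate for consumer.pause(...)
            }
        }
        return new Result(ready, delayed, partitionsToPause);
    }
}
```

With native Kafka, the partitions in `partitionsToPause` would be mapped to the consumer's partition objects and passed to consumer.pause(...). This is exactly the step that has no effect on the shim.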
We only commit offsets of records that are old enough. If the process dies, any records in the “too young” buffer remain uncommitted and will be revisited by whichever consumer receives the partition in the ensuing rebalance.
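As a rough illustration of that commit rule, the sketch below computes the offset to commit per partition from the acknowledged records and the buffered ones. `CommitPlan`, `Rec`, and `offsetsToCommit` are hypothetical names, and `Rec` is a stand-in for a consumed record. It relies on the Kafka convention that a committed offset is the offset of the next record to read, and it adds one assumption of its own: the commit is capped at the lowest buffered offset in a partition, so a buffered record is never skipped.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CommitPlan {
    // Hypothetical stand-in for a consumed record; only partition and offset matter here.
    record Rec(int partition, long offset) {}

    // Returns partition -> offset to commit (the next offset to read).
    static Map<Integer, Long> offsetsToCommit(List<Rec> acked, List<Rec> delayed) {
        Map<Integer, Long> next = new HashMap<>();
        // Highest acknowledged offset + 1 for each partition.
        for (Rec r : acked) {
            next.merge(r.partition(), r.offset() + 1, Long::max);
        }
        // Assumption: never commit past a record that is still in the buffer.
        for (Rec r : delayed) {
            next.computeIfPresent(r.partition(), (p, off) -> Math.min(off, r.offset()));
        }
        return next;
    }
}
```

Partitions with no acknowledged records are left out of the result, so nothing is committed for them on that pass.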
Is there a more generalized form of this algorithm that will work with native Kafka and Pub/Sub Lite?
Edit: Cloud Tasks is a bad idea here because it disconnects the offset commit chain. I need to ensure I only commit offsets for records that have gotten an ack from the downstream system.