I have a process that spawns a producer and a consumer in separate Kubernetes pods. I want to use auto.offset.reset "latest", so I need to guarantee that the consumer pod comes online before the producer pod starts producing messages.
My simple approach is to have the process that spawns these pods create the consumer group before either pod is spawned. In testing, I noticed that the consumer group state goes from stable to empty after about 45 seconds, and the group is then removed anywhere from another 30 seconds to a few minutes later.
How can I guarantee that the consumer group created stays around for longer?
My offsets.retention.minutes is the default of 7 days, as per https://stackoverflow.com/a/65189562/9992341.
I am using Python's confluent_kafka package to create the group (there appears to be no direct API to create a group ID), and I have tried experimenting with the parameters of subscribe.
from confluent_kafka import Consumer

topic = "***"  # placeholder, redacted like the connection values below

consumer = Consumer(
    {
        "sasl.username": "***",
        "sasl.password": "***",
        "bootstrap.servers": "***",
        "group.id": "test-group-1",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "auto.offset.reset": "latest",
    },
)
# Subscribing is the only step taken to "create" the group.
consumer.subscribe([topic])  # , on_lost=lambda *args: None)
I run the above code and then check the group from a separate script using the admin client:
import confluent_kafka.admin

admin_client = confluent_kafka.admin.AdminClient(
    {
        'sasl.username': "***",
        'sasl.password': "***",
        'bootstrap.servers': "***",
        'security.protocol': 'SASL_SSL',
        'sasl.mechanisms': 'PLAIN',
    }
)


def list_groups(admin_client):
    # Print every listed consumer group with its current state.
    future = admin_client.list_consumer_groups()
    res = future.result()
    lst = [(i.group_id, i.state) for i in res.valid]
    for i in sorted(lst):
        print(i)  # noqa: T201


list_groups(admin_client)
# ('test-group-1', <ConsumerGroupState.STABLE: 3>)
However, as stated, the group's state quickly becomes "EMPTY" and the group then disappears, even though the retention should be 7 days (which is overkill for my use case, where the pods come up fairly close together).
Note: I have tested this both while messages were being produced to the topic and while none were, and observed no difference.
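To make the timings above (roughly 45 seconds to "EMPTY", then removal some time later) easier to reproduce, here is a minimal sketch of a helper that records when the group's state changes. The names `watch_group_state` and `fetch_state` are made up for this sketch and are not part of confluent_kafka; `fetch_state` is assumed to be any callable that returns the group's current state, or `None` once the group is no longer listed.

```python
import time
from typing import Callable, List, Optional, Tuple


def watch_group_state(
    fetch_state: Callable[[], Optional[str]],  # hypothetical: returns state or None
    duration_s: float,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[float, Optional[str]]]:
    """Poll fetch_state and record (elapsed_seconds, state) on every change."""
    start = clock()
    transitions: List[Tuple[float, Optional[str]]] = []
    previous = object()  # sentinel so the first observation is always recorded
    while True:
        elapsed = clock() - start
        if elapsed > duration_s:
            break
        state = fetch_state()
        if state != previous:
            # Only state changes are stored, not every poll.
            transitions.append((round(elapsed, 1), state))
            previous = state
        sleep(interval_s)
    return transitions
```

With the admin client from the script above, `fetch_state` could be a small function that calls `admin_client.list_consumer_groups().result()` and returns the `state` of the entry in `.valid` whose `group_id` is "test-group-1", or `None` if there is no such entry. The `clock` and `sleep` parameters exist only so the helper can be exercised without a broker.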