Kafka ensuring consumer group stays alive

Question

I have a process which spawns a producer and a consumer in separate pods (Kubernetes). I want to use auto.offset.reset "latest", and thus need to guarantee that the consumer pod spins up before the producer pod, as I do not want the producer to begin producing messages before the consumer pod comes online.

My simple approach is to have the process which spawns these pods create the consumer group before either pod starts. In testing, I noticed that the consumer group state goes from stable to empty after about 45 seconds, and the group is then removed anywhere from another 30 seconds to a few minutes later.

How can I guarantee that the consumer group created stays around for longer?

My offsets.retention.minutes is the default of 7 days, as per https://stackoverflow.com/a/65189562/9992341.

I am using Python's confluent_kafka package to create the group (there appears to be no direct API to create a group id), and I have tried experimenting with the parameters of the subscribe call.

from confluent_kafka import Consumer

consumer = Consumer(
    {
        "sasl.username": "***",
        "sasl.password": "***",
        "bootstrap.servers": "***",
        "group.id": "test-group-1",
        "security.protocol": "SASL_SSL",
        "sasl.mechanisms": "PLAIN",
        "auto.offset.reset": "latest",
    },
)
topic = "***"  # topic name, redacted like the credentials above
consumer.subscribe([topic])  # also tried passing on_lost=lambda *args: None

I run the above code and check in a separate script with the admin client the group:

import confluent_kafka.admin

admin_client = confluent_kafka.admin.AdminClient(
    {
        'sasl.username': "***",
        'sasl.password': "***",
        'bootstrap.servers': "***",
        'security.protocol': 'SASL_SSL',
        'sasl.mechanisms': 'PLAIN',
    }
)

def list_groups(admin_client):
    """Print every consumer group id together with its current state."""
    future = admin_client.list_consumer_groups()
    res = future.result()
    groups = [(g.group_id, g.state) for g in res.valid]
    for group in sorted(groups):
        print(group)  # noqa: T201

list_groups(admin_client)
# ('test-group-1', <ConsumerGroupState.STABLE: 3>)

However, as stated, this group's state quickly becomes "EMPTY" and the group then disappears, even though the retention should be 7 days (which is overkill for my use case, where the pods come up pretty close together).

Note: I have tested this both while messages were being produced to the topic and while none were, and observed no difference.

"I do not want the producer to begin producing messages before the consumer pod comes online" doesn't this defeat the purpose of an asynchronous messaging system? Why is that your requirement? — user2340612, Aug 10 '23 at 15:30
The only reason is that I have a lot of messages in my topic. I am creating a new consumer group, and if I start from earliest it has to work through all of them first. If I can create the consumer group right before the pods come online, I can consume from latest and skip those messages. — bbd108, Aug 10 '23 at 17:13
Is there something preventing you from reusing the same consumer group? That way you’d resume from where you left off. — user2340612, Aug 10 '23 at 17:33
so this is for a new consumer group. I have a topic where periodically new messages per a given "event" are produced continuously across a variable amount of time (hours). Each consumer consumes from that topic but with a different group id since there are multiple consumers consuming from the same event, but per event id. So I have test-group-1 consuming, and then when a new event id comes in, messages start getting produced and consumed, only with a new group id say test-group-2 — bbd108, Aug 10 '23 at 17:48
This sounds like an anti-pattern in Kafka. You can order data by event-id in a partition, then use `consumer.assign()` to read specific partitions instead of subscribing to all partitions and messing with groups (assign doesn't use consumer groups). You can also use `seek` functions to go to the end when the consumer starts rather than use its startup lifecycle to trigger a producer. — OneCricketeer, Aug 10 '23 at 20:07
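
For reference, the `assign()` suggestion in the last comment might look roughly like the sketch below. It is only an illustration: `poll_loop`, `consume_from_end`, `partition_ids` and `handle` are hypothetical names, `conf` is assumed to be the same dict used in the question (as far as I know the client still expects a `group.id` in it even when no group is joined), and the mapping of event ids to partitions is left out.

```python
def poll_loop(consumer, handle, max_messages=None, timeout=1.0):
    """Poll `consumer`, pass each good message to `handle`, return the count.

    Runs forever unless `max_messages` is given.
    """
    handled = 0
    while max_messages is None or handled < max_messages:
        msg = consumer.poll(timeout)
        if msg is None:
            # Nothing arrived within the timeout; keep waiting.
            continue
        if msg.error():
            raise RuntimeError(str(msg.error()))
        handle(msg)
        handled += 1
    return handled


def consume_from_end(conf, topic, partition_ids, handle):
    """Read the given partitions starting at their current end (hypothetical helper)."""
    # Imported here so poll_loop() can be exercised without the client installed.
    from confluent_kafka import OFFSET_END, Consumer, TopicPartition

    consumer = Consumer(conf)
    try:
        # Manual assignment: no subscribe(), so no group rebalance is involved.
        consumer.assign([TopicPartition(topic, p, OFFSET_END) for p in partition_ids])
        poll_loop(consumer, handle)
    finally:
        consumer.close()
```

A call such as `consume_from_end(conf, topic, [0], print)` would then read only partition 0 from its end, without relying on a consumer group existing beforehand.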

score 1 · Answer 1 · edited Aug 10 '23 at 15:24

The auto.offset.reset setting only applies when Kafka has no committed offset for the consumer group. You can start the consumer with auto.offset.reset = earliest: it will consume from the beginning only on its first run, and on subsequent runs it will resume from the last committed offset.
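
The rule above can be written down as a small model. This is only a sketch of the decision, not Kafka or client code: the function name `starting_offset` and its parameters are made up for illustration, and it ignores the case where a committed offset has fallen out of range.

```python
def starting_offset(committed, low, high, auto_offset_reset="latest"):
    """Return the offset a consumer would start reading a partition from.

    committed: last committed offset for the group, or None if there is none
    low, high: the partition's earliest available offset and its end offset
    """
    if committed is not None:
        # A committed offset always wins; auto.offset.reset is not consulted.
        return committed
    if auto_offset_reset == "earliest":
        return low
    if auto_offset_reset == "latest":
        return high
    raise ValueError(f"unsupported auto.offset.reset: {auto_offset_reset!r}")


# Brand-new group with no commit yet: the reset policy decides.
print(starting_offset(None, low=0, high=5000, auto_offset_reset="earliest"))
print(starting_offset(None, low=0, high=5000, auto_offset_reset="latest"))
# Group that committed offset 4200 earlier: the policy is ignored.
print(starting_offset(4200, low=0, high=5000, auto_offset_reset="earliest"))
```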

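As for the disappearing consumer group: this happens because no offset has been committed for the group. offsets.retention.minutes controls how long committed offsets are kept once the group is empty, so it has nothing to retain for a group that never committed anything.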
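
If the missing commit is indeed the cause, one possible workaround is to commit the current end offsets for the group before either pod starts, so the group has something to retain and the consumer pod later resumes from those positions. A minimal sketch, assuming `conf` is the same dict used in the question; `latest_offsets` and `pin_group_to_latest` are hypothetical names. I have not verified that every broker setup accepts a commit from a consumer that never joined the group, so try it on a test group first.

```python
def latest_offsets(watermarks):
    """Map {partition: (low, high)} to {partition: high}.

    The high watermark is the offset the next produced message will get,
    so committing it means "start after everything already in the topic".
    """
    return {partition: high for partition, (low, high) in watermarks.items()}


def pin_group_to_latest(conf, topic, timeout=10.0):
    """Commit the end offset of every partition of `topic` for conf["group.id"]."""
    # Imported here so latest_offsets() stays usable without the client installed.
    from confluent_kafka import Consumer, TopicPartition

    consumer = Consumer(conf)
    try:
        metadata = consumer.list_topics(topic, timeout=timeout)
        watermarks = {}
        for partition in metadata.topics[topic].partitions:
            marks = consumer.get_watermark_offsets(
                TopicPartition(topic, partition), timeout=timeout
            )
            if marks is None:
                raise RuntimeError(f"timed out reading watermarks for partition {partition}")
            watermarks[partition] = marks
        offsets = [
            TopicPartition(topic, partition, offset)
            for partition, offset in latest_offsets(watermarks).items()
        ]
        # Synchronous commit so a failure surfaces here instead of being lost.
        consumer.commit(offsets=offsets, asynchronous=False)
    finally:
        consumer.close()
```

The spawning process would call `pin_group_to_latest(conf, topic)` once and then start the pods. With a committed offset in place, the order in which the pods come up should matter less, since the consumer resumes from the committed position rather than from whatever "latest" is at the moment it joins.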

answered Aug 10 '23 at 07:40 by roccomathijn; edited Aug 10 '23 at 15:24 by Alexandre Juma

So I have done earliest, it just seems inefficient to loop through all those messages I know are not relevant. If I create the consumer groups immediately before the pods spin up, then it would go through fewer messages. To commit an offset for this group, I am using auto commit, I have tried doing `consumer.poll()` but the group still disappears – bbd108 Aug 10 '23 at 17:15
