How can I produce all messages at once and consume one message per minute?

Config properties I can use:

[image: config properties]

Note: please keep in mind that I cannot use "max.poll.records".

My consumer class:

var settings = ConfigurationManager.KafkaSettings.Topics[Topics.FaturaKaydetViaTp];

LogManager.Logger.Debug("Consumer initiating for {topic}", settings.TopicName);

using (var consumer = new ConsumerBuilder<Ignore, MailMessage>(consumerConfig).SetValueDeserializer(new ObjectDeserializer<MailMessage>()).Build())
{
    LogManager.Logger.Debug("Consumer initiated");

    LogManager.Logger.Debug("Subscribing for {topic}", settings.TopicName);
    
    consumer.Subscribe(settings.TopicName);

    try
    {
        while (true)
        {
            try
            {
                
                var cr = consumer.Consume();

                LogManager.Logger.Debug("Message received for '{topic}' at: '{topicPartitionOffset}'.", settings.TopicName, cr.TopicPartitionOffset);
                if (HandleOnMessage(cr.Value))
                    if (ConfigurationManager.KafkaSettings.AutoCommit == false)
                        consumer.Commit(cr);
            }
            catch (ConsumeException e)
            {
                LogManager.Logger.Fatal(e, "ConsumeException");
            }
        }
    }
    catch (OperationCanceledException)
    {
        // Ensure the consumer leaves the group cleanly and final offsets are committed.
        consumer.Close();
    }
}
OneCricketeer
Gülsen Keskin
  • "One per minute". Why? Kafka is the wrong tool if you are trying to do blocking batches. This would eventually lead to unnecessary consumer group rebalancing if you wait too long between consume calls – OneCricketeer Jan 07 '22 at 16:30

3 Answers

2

The general strategy would require you to pause the consumer, sleep the thread running the loop, then re-subscribe / resume the consumer.

This would need to start with calling SetPartitionsAssignedHandler on the ConsumerBuilder, then saving the partition assignment passed to the handler, since it is needed for the Consumer.Pause() and Consumer.Resume() methods.

IEnumerable<TopicPartition> partitions = new List<TopicPartition>(); // TODO: assign this from handler
using (var consumer = ... ) { // call SetPartitionsAssignedHandler on the ConsumerBuilder and set partitions above.

    consumer.Subscribe(settings.TopicName);
    while (true)
    {
        try
        {
            // TODO: Might need to check if currently paused, somehow
            // In Confluent.Kafka 1.x, Resume and Pause return void and throw on failure
            consumer.Resume(partitions);
            // TODO: handle the KafkaException thrown when resuming fails

            var cr = consumer.Consume(); // This already gets one record from the assigned, resumed partitions

            LogManager.Logger.Debug("Message received for '{topic}' at: '{topicPartitionOffset}'.", cr.Topic, cr.TopicPartitionOffset);

            // Pause and sleep
            consumer.Pause(partitions);
            Thread.Sleep(TimeSpan.FromMinutes(1)); // 1 minute
        }
        catch (ConsumeException e)
        {
            LogManager.Logger.Fatal(e, "ConsumeException");
        }
    }
}

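Keep in mind that if you run multiple instances of this, each instance reads from its own partitions, so the group as a whole may consume more than one message per minute. Pause() and Resume() do not trigger a rebalance by themselves, but any rebalance (for example when an instance joins or leaves the group) changes the assignment and can delay processing significantly.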
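As a rough sketch of how the assignment could be captured, assuming Confluent.Kafka 1.x, string message values (to avoid a custom deserializer) and the default eager rebalance protocol, something like the following might work. The class name PausingConsumer and its Run method are made up for illustration; only the builder, Consume, Pause, Resume and Close calls come from the client library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using Confluent.Kafka;

// Hypothetical helper, not part of Confluent.Kafka.
public static class PausingConsumer
{
    public static void Run(ConsumerConfig config, string topic, CancellationToken token)
    {
        // Latest assignment; the handler runs on the thread that calls Consume,
        // so no locking is used here.
        var partitions = new List<TopicPartition>();

        using (var consumer = new ConsumerBuilder<Ignore, string>(config)
            .SetPartitionsAssignedHandler((c, assigned) =>
            {
                partitions.Clear();
                partitions.AddRange(assigned);
            })
            .Build())
        {
            consumer.Subscribe(topic);
            try
            {
                while (!token.IsCancellationRequested)
                {
                    try
                    {
                        if (partitions.Count > 0)
                            consumer.Resume(partitions);

                        // Blocks until one record arrives or the token is cancelled.
                        var cr = consumer.Consume(token);
                        Console.WriteLine($"Received '{cr.Message.Value}' at {cr.TopicPartitionOffset}");

                        if (partitions.Count > 0)
                            consumer.Pause(partitions);

                        // Wait one minute, or less if cancellation is requested.
                        token.WaitHandle.WaitOne(TimeSpan.FromMinutes(1));
                    }
                    catch (ConsumeException e)
                    {
                        Console.Error.WriteLine($"ConsumeException: {e.Error.Reason}");
                    }
                }
            }
            catch (OperationCanceledException)
            {
                // Consume(token) was cancelled; fall through to Close().
            }
            finally
            {
                consumer.Close(); // leave the group cleanly
            }
        }
    }
}
```

The first Resume is skipped because the assignment only arrives during the first Consume call. Offsets are left to auto-commit here; a manual consumer.Commit(cr) could be added after the message is handled.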

OneCricketeer
1

First of all, in the Java client you need to set max.poll.records to 1, or else each consumer.poll(Duration) call can return many records (up to 500 by default).

Also, the Duration passed to consumer.poll(…) does not enforce waiting; it is only an upper bound on how long the call blocks. From the documentation:

This method returns immediately if there are records available. Otherwise, it will await the passed timeout. If the timeout expires, an empty record set will be returned

To fetch one per minute you will need to poll with a short duration and then wait 60 s (not accurate), or use some other tool to poll at 60 s intervals.
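The "poll briefly, then wait" idea could be sketched as below. To keep the example self-contained it does not reference the Kafka client directly: ThrottledConsumer, Run and RemainingDelay are hypothetical names, and the tryConsume delegate stands in for whatever call returns one message or null on timeout. Subtracting the handling time from the interval makes the spacing a little more accurate than a fixed 60 s sleep.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper; the names are illustrative only.
public static class ThrottledConsumer
{
    // Time still to wait so that two reads are at least `interval` apart.
    public static TimeSpan RemainingDelay(TimeSpan interval, TimeSpan elapsed)
    {
        var remaining = interval - elapsed;
        return remaining > TimeSpan.Zero ? remaining : TimeSpan.Zero;
    }

    // tryConsume returns one message, or null if none arrived within the timeout.
    public static void Run<T>(
        Func<TimeSpan, T> tryConsume,
        Action<T> handle,
        TimeSpan interval,
        TimeSpan pollTimeout,
        CancellationToken token) where T : class
    {
        while (!token.IsCancellationRequested)
        {
            var message = tryConsume(pollTimeout);
            if (message == null)
                continue; // nothing available yet; poll again

            var sinceRead = Stopwatch.StartNew();
            handle(message);

            // Wait out the rest of the interval; returns early if cancelled.
            token.WaitHandle.WaitOne(RemainingDelay(interval, sinceRead.Elapsed));
        }
    }
}
```

With the C# client this might be called as ThrottledConsumer.Run(t => consumer.Consume(t), cr => HandleOnMessage(cr.Message.Value), TimeSpan.FromMinutes(1), TimeSpan.FromSeconds(1), token), since Consume(TimeSpan) returns null when the timeout expires. The interval should stay well below max.poll.interval.ms (5 minutes by default), otherwise the consumer is removed from the group.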

ulfryk
  • 2
    Please note that I cannot use "max.poll.records": https://github.com/confluentinc/confluent-kafka-dotnet/issues/1451 – Gülsen Keskin Jan 07 '22 at 12:01
  • @GülsenKeskin - so you can just ignore this part, as in the C# client you get messages 1 by 1 without any additional effort :) - you only need to sort out the interval and poll duration. – ulfryk Jan 07 '22 at 12:21
0

Depending on which API you are using, the consumer.poll() method accepts a timeout argument (a Duration in the Java client).

Example code in Java:

ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(60000));

More information in docs : https://kafka.apache.org/26/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html

Edit: After code has been added.

The consumer.Consume method also accepts a timeout argument.

Supporting documentation: https://docs.confluent.io/5.0.0/clients/confluent-kafka-dotnet/api/Confluent.Kafka.Consumer.html#Confluent_Kafka_Consumer_Consume_Confluent_Kafka_Message__System_Int32_

pm.
  • I am using confluent kafka and as far as I know consumer does not have a poll method – Gülsen Keskin Jan 07 '22 at 11:33
  • 2
    confluent kafka is just a distribution of kafka. what are you using to connect to the instance of kafka? python/java or just a cli consumer? – pm. Jan 07 '22 at 11:42
  • 1
    I am using c# .. – Gülsen Keskin Jan 07 '22 at 11:43
  • 1
    The C# consumer.poll method also accepts duration as an argument. public void Poll(TimeSpan timeout) https://docs.confluent.io/5.0.0/clients/confluent-kafka-dotnet/api/Confluent.Kafka.Consumer.html#Confluent_Kafka_Consumer_Poll_System_Int32_ – pm. Jan 07 '22 at 11:47
  • Thank you. So can you show me an example of how to use it? – Gülsen Keskin Jan 07 '22 at 12:29