
A Java application publishes messages to a Kafka cluster; the code that sends records is given below. Because the application sends huge volumes of messages to Kafka, it is difficult to debug. I can see that some of the messages never make it to Kafka, because a subscriber on the broker for the same topic does not receive them. So it looks like there is message loss somewhere in this pipeline.

To investigate, I check the java.util.concurrent.Future returned by send() with isDone(), and it comes back false. So I am confused about where exactly the loss happens: is the application failing to send the record to Kafka, or is the Kafka broker failing to process the record before placing it on the topic?

// Setting producer properties from the config XML...
kprops.put("security.protocol", properties.getProperty("security_protocol"));
kprops.put("acks", properties.getProperty("kafka_acks"));

...
...
// Producer initialization
Producer<String, String> producer = new KafkaProducer<>(kprops);
...
...

ProducerRecord<String, String> producerRecord = new ProducerRecord<>(topicName, key, newValue);
Future<RecordMetadata> kafkaResponse = producer.send(producerRecord);
String kafkaSuccessStatus = kafkaResponse.isDone() ? "Sending message to kafka completed" : "Sending message to kafka not completed";
LOGGER.debug(kafkaSuccessStatus);

I am using isDone() to check whether the send succeeded. Is that the right way to do it? If not, is there another way to get this response with more information? Since I see no errors or other information in the logs, it is hard to find out what exactly is happening and where the data is being dropped. I am using Kafka 0.11.
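For reference, a minimal sketch of the two alternatives the 0.11 client offers, reusing producer, kafkaResponse, topicName, key, newValue, and LOGGER from the snippet above: blocking on the returned Future with get(), which throws if the send failed, or passing a Callback to send() that runs once the broker acknowledges the record or the send fails.

import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.producer.RecordMetadata;

// Option 1: block until the broker responds. get() returns the metadata on
// success and throws an ExecutionException wrapping the real cause on failure.
try {
    RecordMetadata metadata = kafkaResponse.get();
    LOGGER.debug("Written to " + metadata.topic() + "-" + metadata.partition()
            + " at offset " + metadata.offset());
} catch (ExecutionException e) {
    LOGGER.error("Send failed", e.getCause());
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}

// Option 2: stay asynchronous. The callback is invoked with a non-null
// exception whenever the record could not be written.
producer.send(new ProducerRecord<>(topicName, key, newValue),
        (metadata, exception) -> {
            if (exception != null) {
                LOGGER.error("Send failed for key " + key, exception);
            } else {
                LOGGER.debug("Acknowledged at offset " + metadata.offset());
            }
        });

The callback form avoids blocking on each record, which matters at the stated volume of billions of records per day.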

Info on the application: it needs to process very large volumes (billions) of records daily and publish them to the Kafka broker. It is a multi-threaded application that reads lines from a file and sends them to Kafka.

Nomad
That checks whether they are prepared to be sent in the next batch, I believe. You could add `producer.flush()` and/or `producer.close()` in a Runtime shutdown hook – OneCricketeer Sep 19 '19 at 19:46

1 Answer


The standard Kafka client batches messages before actually sending them; it works like a client-side buffer. If your application shuts down the client without flushing that buffer (and waiting until it has actually been flushed), you are at risk of losing messages. This is the price you pay for performance. It also explains the isDone() result: the future stays incomplete while the record sits in the buffer or in flight, and it only completes once the broker acknowledges the record or the send fails.
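As a sketch of that pattern, assuming the same kprops, topicName, key, and newValue as in the question: flush() blocks until every buffered record has been sent (or has failed), and close() flushes before releasing resources, so try-with-resources on the producer is the simplest guarantee.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

// try-with-resources calls close() even on exceptions; close() flushes the
// buffer and waits for in-flight requests before shutting down.
try (Producer<String, String> producer = new KafkaProducer<>(kprops)) {
    producer.send(new ProducerRecord<>(topicName, key, newValue));
    producer.flush(); // optional mid-run barrier: block until the buffer drains
}

// For a long-lived producer, the shutdown-hook variant from the comment above:
// Runtime.getRuntime().addShutdownHook(new Thread(producer::close));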

If you value reliability more than performance, you might consider using an old-fashioned message queue instead.

Aleksey