
I am using the Reactor library to fetch a large stream of data from the network and send it to a Kafka broker using the reactive Kafka (reactor-kafka) approach.

Below is the Kafka producer I am using:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import reactor.core.publisher.Flux;
import reactor.kafka.sender.KafkaSender;
import reactor.kafka.sender.SenderOptions;
import reactor.kafka.sender.SenderRecord;

// Logs.Data and LogRecord are domain-specific types (imports omitted)

public class LogProducer {

    private static final Logger log = LoggerFactory.getLogger(LogProducer.class);

    private final KafkaSender<String, String> sender;

    public LogProducer(String bootstrapServers) {

        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "log-producer");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        SenderOptions<String, String> senderOptions = SenderOptions.create(props);

        sender = KafkaSender.create(senderOptions);
    }

    public void sendMessages(String topic, Flux<Logs.Data> records) {

        AtomicInteger sentCount = new AtomicInteger(0);
        AtomicInteger fCount = new AtomicInteger(0);

        // Count all records in the Flux (for testing/verification only)
        records.doOnNext(r -> fCount.incrementAndGet()).subscribe();
        System.out.println("Total Records: " + fCount);
        
        sender.send(records.doOnNext(r -> sentCount.incrementAndGet())
                .map(record -> {
                    LogRecord lrec = record.getRecords().get(0);
                    String id = lrec.getId();
                    return SenderRecord.create(new ProducerRecord<>(topic, id,
                            lrec.toString()), id);
                })).then()
                .doOnError(e -> {
                    log.error("[FAIL]: Send to the topic: '{}' failed.", topic, e);
                })
                .doOnSuccess(s -> {
                    log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);
                })
                .subscribe();
    }

}
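
For context, `sendMessages()` is called from a task scheduled at a fixed delay (see the comments below). A simplified sketch of that call site, where `fetchLogsFromNetwork()`, `logProducer`, `bootstrapServers` and `topic` are placeholders for my actual setup:

// Simplified call site: a ScheduledExecutorService runs the fetch-and-send task every 5 minutes.
// fetchLogsFromNetwork() stands in for the actual network fetch, which is omitted here.
ScheduledExecutorService service = Executors.newSingleThreadScheduledExecutor();
LogProducer logProducer = new LogProducer(bootstrapServers);

Runnable scheduledLogsSubmit = () -> {
    Flux<Logs.Data> records = fetchLogsFromNetwork();
    logProducer.sendMessages(topic, records); // returns immediately; the send runs asynchronously
};

service.scheduleWithFixedDelay(scheduledLogsSubmit, 0, 5, TimeUnit.MINUTES);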

The total number of records in the Flux (fCount) and the number of records sent to the Kafka topic (sentCount) do not match; no error is reported and the pipeline completes successfully.

For example: in one case the total number of records in the Flux is 2758, while the count sent to Kafka is 256. Is there any Kafka configuration that needs to be modified, or am I missing anything?
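
One thing I have considered (not in my current code) is counting what the broker actually acknowledged, by counting the SenderResults emitted by `sender.send(...)` instead of the records entering the pipeline. A rough sketch of what I mean, reusing the same `sender`, `records` and `topic` as above:

// Sketch (not my current code): count acknowledgements from the sender's result flux
// rather than the upstream records, so the count reflects what Kafka actually accepted.
AtomicInteger ackCount = new AtomicInteger(0);

sender.send(records.map(record -> {
            LogRecord lrec = record.getRecords().get(0);
            String id = lrec.getId();
            return SenderRecord.create(new ProducerRecord<>(topic, id, lrec.toString()), id);
        }))
        .doOnNext(result -> ackCount.incrementAndGet()) // one SenderResult per acknowledged record
        .then()
        .doOnSuccess(s -> log.info("[SUCCESS]: {} records acknowledged on topic '{}'", ackCount, topic))
        .subscribe();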

===========================================================

Updated based on the comments

sender.send(records
        .map(record -> {
            LogRecord lrec = record.getRecords().get(0);
            String id = lrec.getId();
            sleep(5); // sleep for 5 ns
            return SenderRecord.create(new ProducerRecord<>(topic, id,
                    lrec.toString()), id);
        })).then()
        .doOnError(e -> {
            log.error("[FAIL]: Send to the topic: '{}' failed.", topic, e);
        })
        .doOnSuccess(s -> {
            log.info("[SUCCESS]: {} records sent to the topic: '{}'", sentCount, topic);
        })
        .subscribe();
sleep(10); // sleep for 10 ns

The above code worked fine on one system but failed to send all messages on another.
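
What I am really looking for, instead of the sleeps, is a way to wait deterministically until the whole send pipeline has completed. One option I am considering (sketch only, not yet verified on my setup) is blocking on the Mono returned by `then()`:

// Sketch: block the calling (scheduled) thread until all sends complete, instead of
// sleeping for a system-dependent amount of time. Requires java.time.Duration.
sender.send(records.map(record -> {
            LogRecord lrec = record.getRecords().get(0);
            String id = lrec.getId();
            return SenderRecord.create(new ProducerRecord<>(topic, id, lrec.toString()), id);
        }))
        .then()
        .doOnError(e -> log.error("[FAIL]: Send to the topic: '{}' failed.", topic, e))
        .block(Duration.ofMinutes(4)); // arbitrary safety-net timeout, kept under the 5-minute schedule interval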

  • How does it look if you remove that first `records.doOnNext(r -> fCount.incrementAndGet()).subscribe();`? How do you call that `sendMessages()` method? Do you wait enough time to process all the messages? The `subscribe()` is async and there is no guarantee that your main thread is going to be blocked until you finish processing over there. – Artem Bilan Feb 27 '20 at 16:49
  • The first subscribe statement is to count the number of records for testing purposes; removing it does not change the messages sent to Kafka. `sendMessages()` is called from a task executed by a ScheduledExecutorService at a fixed delay: `service.scheduleWithFixedDelay(scheduledLogsSubmit, 0, 5, TimeUnit.MINUTES);` – Nishit Jain Feb 28 '20 at 05:56
  • @ArtemBilan any clue on what could be the problem?? – Nishit Jain Mar 03 '20 at 09:38
  • You exit from the program too early. Be sure that the main thread is blocked long enough – Artem Bilan Mar 03 '20 at 13:29
  • Thanks Artem, your comment was helpful; yes, that was the problem. I added a 10 ns sleep in the map() operator and now it is putting all the messages onto the Kafka topic. – Nishit Jain Apr 11 '20 at 06:28
  • @ArtemBilan I faced the same issue again, as the wait time may be system dependent, e.g. 50 ns was enough on one system but may not be on another depending on the resources. Is there any way to guarantee that the sending of all the messages has finished? – Nishit Jain Sep 24 '20 at 09:07
  • Updated the post with the changes – Nishit Jain Sep 24 '20 at 09:13
  • @ArtemBilan I have asked a related separate question https://stackoverflow.com/questions/66046392, will you be able to help me with that – Nishit Jain Feb 04 '21 at 13:15
