
I'm using Reactor Kafka in a Spring Boot reactive app, with Spring Cloud Sleuth for distributed tracing. I've set up Sleuth to use a custom propagation key from a header named "traceId", and I've customized the log format to print that header in my logs, so a request like

curl -H "traceId: 123456" -X POST http://localhost:8084/parallel

will print 123456 in every log statement downstream of the Controller.
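
For reference, the Sleuth side of this is configured along these lines (a sketch using the Sleuth 2.x property names, not my exact config):

# Carry the custom header across process boundaries as a propagation key
spring.sleuth.propagation-keys=traceId
# Expose the propagated field to the MDC so the log pattern can print it
spring.sleuth.log.slf4j.whitelisted-mdc-keys=traceId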

I would now like this header to be propagated via Kafka as well. I understand that Sleuth has built-in instrumentation for Kafka, so the header should be propagated automatically; however, I'm unable to get this to work.

From my Controller, I produce a message onto a Kafka topic, and then have another Kafka consumer pick it up for processing.

Here's my Controller:

@RestController
@RequestMapping("/parallel")
public class BasicController {

    private Logger logger = Loggers.getLogger(BasicController.class);

    KafkaProducerLoadGenerator generator = new KafkaProducerLoadGenerator();    

    @PostMapping
    public Mono<ResponseEntity> createMessage() {
        int data = (int)(Math.random()*100000);
        return Flux.just(data)
                .doOnNext(num -> logger.info("Generating document for {}", num))
                .map(generator::generateDocument)
                .flatMap(generator::sendMessage)
                .doOnNext(result ->
                        logger.info("Sent message {}, offset is {} to partition {}",
                                result.getT2().correlationMetadata(),
                                result.getT2().recordMetadata().offset(),
                                result.getT2().recordMetadata().partition()))
                .doOnError(error -> logger.error("Error in subscribe while sending message", error))
                .single()
                .map(tuple -> ResponseEntity.status(HttpStatus.OK).body(tuple.getT1()));

    }
}

Here's the code that produces messages onto the Kafka topic:

@Component
public class KafkaProducerLoadGenerator {

    private static final Logger logger = Loggers.getLogger(KafkaProducerLoadGenerator.class);

    private static final String bootstrapServers = "localhost:9092";
    private static final String TOPIC = "load-topic";

    private KafkaSender<Integer, String> sender;

    private static int documentIndex = 0;

    public KafkaProducerLoadGenerator() {
        this(bootstrapServers);
    }

    public KafkaProducerLoadGenerator(String bootstrapServers) {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "load-generator");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

        SenderOptions<Integer, String> senderOptions = SenderOptions.create(props);
        sender = KafkaSender.create(senderOptions);
    }

    @NewSpan("generator.sendMessage")
    public Flux<Tuple2<DataDocument, SenderResult<Integer>>> sendMessage(DataDocument document) {
        return sendMessage(TOPIC, document)
                .map(result -> Tuples.of(document, result));
    }

    public Flux<SenderResult<Integer>> sendMessage(String topic, DataDocument document) {
        ProducerRecord<Integer, String> producerRecord = new ProducerRecord<>(topic, document.getData(), document.toString());
        return sender.send(Mono.just(SenderRecord.create(producerRecord, document.getData())))
                .doOnNext(record -> logger.info("Sent message to partition={}, offset={} ", record.recordMetadata().partition(), record.recordMetadata().offset()))
                .doOnError(e -> logger.error("Error sending message " + documentIndex, e));
    }

    public DataDocument generateDocument(int data) {
        return DataDocument.builder()
                .header("Load Data")
                .data(data)
                .traceId("trace"+data)
                .timestamp(Instant.now())
                .build();
    }
}
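
If the instrumentation can't be made to work, I could presumably attach the custom header to the record by hand as a fallback. A rough sketch of that (assuming DataDocument exposes a getTraceId() getter, and importing java.nio.charset.StandardCharsets):

public Flux<SenderResult<Integer>> sendMessageWithHeader(String topic, DataDocument document) {
    ProducerRecord<Integer, String> producerRecord = new ProducerRecord<>(topic, document.getData(), document.toString());
    // Manually copy the custom trace header onto the outgoing Kafka record
    producerRecord.headers().add("traceId", document.getTraceId().getBytes(StandardCharsets.UTF_8));
    return sender.send(Mono.just(SenderRecord.create(producerRecord, document.getData())));
}

That would get the header onto the message, but it wouldn't restore the Brave trace context on the consumer side, which is what I'm really after.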

My consumer looks like this:

@Component
@Scope(scopeName = ConfigurableBeanFactory.SCOPE_PROTOTYPE)
public class IndividualConsumer {

    private static final Logger logger = Loggers.getLogger(IndividualConsumer.class);

    private static final String bootstrapServers = "localhost:9092";
    private static final String TOPIC = "load-topic";

    private int consumerIndex = 0;

    public ReceiverOptions<Integer, DataDocument> setupConfig(String bootstrapServers) {
        Map<String, Object> properties = new HashMap<>();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        properties.put(ConsumerConfig.CLIENT_ID_CONFIG, "load-topic-consumer-"+consumerIndex);
        properties.put(ConsumerConfig.GROUP_ID_CONFIG, "load-topic-multi-consumer-2");
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, IntegerDeserializer.class);
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, DataDocumentDeserializer.class);
        return ReceiverOptions.create(properties);
    }

    public void setIndex(int i) {
        consumerIndex = i;
    }

    @EventListener(ApplicationReadyEvent.class)
    public Disposable consumeMessage() {
        ReceiverOptions<Integer, DataDocument> receiverOptions = setupConfig(bootstrapServers)
                .subscription(Collections.singleton(TOPIC))
                .addAssignListener(receiverPartitions -> logger.debug("onPartitionsAssigned {}", receiverPartitions))
                .addRevokeListener(receiverPartitions -> logger.debug("onPartitionsRevoked {}", receiverPartitions));

        Flux<ReceiverRecord<Integer, DataDocument>> messages = Flux.defer(() -> {
            KafkaReceiver<Integer, DataDocument> receiver = KafkaReceiver.create(receiverOptions);
            return receiver.receive();
        });
        Consumer<? super ReceiverRecord<Integer, DataDocument>> acknowledgeOffset = record -> record.receiverOffset().acknowledge();
        return messages
                .publishOn(Schedulers.newSingle("Parallel-Consumer"))
                .doOnError(error -> logger.error("Error in the reactive chain", error))
                .delayElements(Duration.ofMillis(100))
                .doOnNext(record -> {
                    logger.info("Consumer {}: Received from partition {}, offset {}, data with index {}",
                            consumerIndex,
                            record.receiverOffset().topicPartition(),
                            record.receiverOffset().offset(),
                            record.value().getData());
                })
                .doOnNext(acknowledgeOffset)
                .doOnError(error -> logger.error("Error receiving record", error))
                .retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5))
                .subscribe();
    }
}

I would expect Sleuth to automatically carry over the built-in Brave trace and the custom headers to the consumer, so that the trace covers the entire transaction.

However I have two problems.

  1. The generator bean doesn't join the trace started in the Controller. It starts a new trace for every message sent.
  2. The trace isn't propagated from the Kafka producer to the Kafka consumer.

I can resolve #1 above by replacing the generator bean with a plain Java class and instantiating it in the controller. However, that means I can't autowire other dependencies into it, and in any case it doesn't solve #2.

I am able to load an instance of the brave.kafka.clients.KafkaTracing bean, so I know it's being created by Spring. However, it doesn't look like the instrumentation is working. I inspected the messages on Kafka using Kafka Tool, and no headers are populated on any of them. In fact, the consumer doesn't have a trace at all.

2020-05-06 23:57:32.898     INFO  parallel-consumer:local [123-21922,578c510e23567aec,578c510e23567aec] 8180 --- [reactor-http-nio-3] rja.parallelconsumers.BasicController    : Generating document for 23965
2020-05-06 23:57:32.907     INFO  parallel-consumer:local [52e02d36b59c5acd,52e02d36b59c5acd,52e02d36b59c5acd] 8180 --- [single-11] r.p.kafka.KafkaProducerLoadGenerator     : Sent message to partition=17, offset=0 
2020-05-06 23:57:32.908     INFO  parallel-consumer:local [123-21922,578c510e23567aec,578c510e23567aec] 8180 --- [single-11] rja.parallelconsumers.BasicController    : Sent message 23965, offset is 0 to partition 17
2020-05-06 23:57:33.012     INFO  parallel-consumer:local [-,-,-] 8180 --- [parallel-5] r.parallelconsumers.IndividualConsumer   : Consumer 8: Received from partition load-topic-17, offset 0, data with index 23965

In the log above, [123-21922,578c510e23567aec,578c510e23567aec] corresponds to [custom-trace-header, Brave traceId, Brave spanId].
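
My understanding is that brave-kafka-clients propagates the trace by wrapping the plain Kafka clients, roughly like this (a minimal sketch against a plain KafkaProducer, not my Reactor Kafka sender; tracing and rawProducer are placeholders):

// Wrap the plain producer so that send() injects the trace headers into each record
KafkaTracing kafkaTracing = KafkaTracing.create(tracing); // tracing = the brave.Tracing bean
Producer<Integer, String> tracedProducer = kafkaTracing.producer(rawProducer);
// On the consumer side, kafkaTracing.nextSpan(consumerRecord) continues the trace from those headers

As far as I can tell, my KafkaSender is built from SenderOptions and creates its own producer internally, so I don't see where this wrapping would be applied in my setup, which may be why no headers show up on the messages.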

What am I missing?

Rajesh J Advani
  • I am trying to achieve similar behaviour. Were you able to get it working and how? – Amardeep Nov 19 '20 at 12:11
  • @amardeep Unfortunately not. – Rajesh J Advani Feb 03 '21 at 06:48
  • Look at the comments under https://stackoverflow.com/a/53184960/837785 : it seems that Spring Cloud Stream does that kind of propagation. – Крис Feb 05 '21 at 06:29
  • It may be related to https://github.com/spring-cloud/spring-cloud-sleuth/issues/1815. I lost all my traceId information in the logs, likely because SLF4J/Logback MDCs are ThreadLocals, which can cause some weirdness when logging in reactive contexts. – Archimedes Trajano Feb 16 '21 at 04:23
