
I have some Flink jobs that use Kafka as both source and sink, and I want to add tracing so that any message consumed from or produced to Kafka is properly traced. For that I'm using Kafka interceptors to intercept messages and log the trace ID, span ID, and parent trace ID, using opentracing-kafka-client (v0.1.11) in conjunction with brave-opentracing (v0.35.1). The reason I'm using custom interceptors is that I need to log messages in a specific format.

After configuring the interceptors, they are invoked and use the tracing information (from the headers) coming from the upstream system and log it, but when the message is produced back to Kafka, the tracing context is lost. For instance, consider the scenario below:

1) A message is put on Kafka by some REST service.
2) The message is consumed by the Flink job; the interceptor kicks in, uses the tracing information from the headers, and logs it.
3) After processing, the message is produced by the Flink job back to Kafka.

It works well up to step 2, but when the message is produced, the tracing information from the previous step is not used, because the outgoing record has no header information; hence an entirely new trace is created.
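The gap can be illustrated with a minimal, Kafka-free sketch (header name and method are hypothetical): the Flink operators forward only the record value, so a fresh record is built at produce time and none of the consumed headers, including the trace context, survive.

```java
import java.util.HashMap;
import java.util.Map;

public class TraceLossSketch {

    // Simulates a Flink pipeline that forwards only the record value:
    // whatever headers came with the consumed record are dropped, and the
    // producer builds a fresh record with empty headers.
    static Map<String, String> headersAfterProcessing(Map<String, String> consumedHeaders) {
        return new HashMap<>();
    }

    public static void main(String[] args) {
        Map<String, String> consumed = new HashMap<>();
        consumed.put("traceId", "abc123"); // hypothetical header name

        Map<String, String> produced = headersAfterProcessing(consumed);
        // The producer interceptor sees no trace context, so it starts a new trace.
        System.out.println(produced.containsKey("traceId")); // false
    }
}
```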

I'm registering the tracer as below:

import brave.Tracing;
import brave.opentracing.BraveTracer;
import io.opentracing.Tracer;
import io.opentracing.util.GlobalTracer;

public class MyTracer {

  private static final Tracer INSTANCE = BraveTracer.create(Tracing.newBuilder().build());

  public static void registerTracer() {
    GlobalTracer.registerIfAbsent(INSTANCE);
  }

  public static Tracer getTracer() {
    return INSTANCE;
  }
}

And I'm using the TracingConsumerInterceptor and TracingProducerInterceptor from opentracing-kafka-client.
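For reference, these interceptors are wired in through the standard Kafka client properties; a minimal sketch, assuming opentracing-kafka-client is on the classpath and using the `interceptor.classes` property key as a string literal (it corresponds to `ConsumerConfig`/`ProducerConfig.INTERCEPTOR_CLASSES_CONFIG`):

```java
import java.util.Properties;

public class TracingInterceptorConfig {

    // Interceptor class names provided by opentracing-kafka-client.
    static final String CONSUMER_INTERCEPTOR =
        "io.opentracing.contrib.kafka.TracingConsumerInterceptor";
    static final String PRODUCER_INTERCEPTOR =
        "io.opentracing.contrib.kafka.TracingProducerInterceptor";

    // Properties fragment for the Kafka consumer (source side).
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("interceptor.classes", CONSUMER_INTERCEPTOR);
        return props;
    }

    // Properties fragment for the Kafka producer (sink side).
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("interceptor.classes", PRODUCER_INTERCEPTOR);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("interceptor.classes"));
        System.out.println(producerProps().getProperty("interceptor.classes"));
    }
}
```

The interceptors resolve the tracer via `GlobalTracer`, which is why `MyTracer.registerTracer()` must run before the Kafka clients are created.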

Akhil
  • Hi! Could you make it work? – Rocel Sep 30 '20 at 01:09
    @Rocel Yeah, got it working. The issue was that the tracing headers were not being populated in the headers after the message was consumed and processed, so I had to make the tracing headers part of the message body (i.e. populating the headers into the message body within the Kafka interceptors). Since the message could change during processing, they had to be part of the body. There could be a more optimal solution, like using ThreadLocals, but a ThreadLocal is a bit tricky when your Flink job is performing some batching operations. – Akhil Sep 30 '20 at 08:22
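The workaround Akhil describes, carrying the tracing headers inside the message body so they survive Flink processing, can be sketched as below. The delimiter format, field names, and method names here are hypothetical illustrations, not the actual implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class TraceInBodySketch {

    // Consumer-side interceptor: prepend the trace fields to the payload
    // so they travel through the Flink pipeline as part of the value.
    static String embed(Map<String, String> headers, String payload) {
        String traceId = headers.getOrDefault("traceId", "");
        String spanId = headers.getOrDefault("spanId", "");
        return traceId + "|" + spanId + "|" + payload;
    }

    // Producer-side interceptor: pull the trace fields back out of the body
    // so the outgoing record can be logged (or re-headered) with them.
    static Map<String, String> extract(String body) {
        String[] parts = body.split("\\|", 3);
        Map<String, String> fields = new HashMap<>();
        fields.put("traceId", parts[0]);
        fields.put("spanId", parts[1]);
        fields.put("payload", parts[2]);
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> upstream = new HashMap<>();
        upstream.put("traceId", "abc123");
        upstream.put("spanId", "def456");

        // The embedded value survives any processing that preserves the prefix.
        String wire = embed(upstream, "order-created");
        Map<String, String> restored = extract(wire);
        System.out.println(restored.get("traceId")); // abc123
    }
}
```

As the comment notes, this only works because the trace fields are part of the value itself; a ThreadLocal-based alternative breaks down when the job batches records across calls.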

0 Answers