
In my Kafka Streams application, I have a task that sets up a punctuator scheduled by wall-clock time. The punctuator iterates over the entries of a store and does something with them, like this:

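// Key/value types are placeholders; substitute the store's actual types.
var store = (KeyValueStore<String, String>) context().getStateStore("MyStore");

// Close the iterator when done so the store's resources are released.
try (var iter = store.all()) {
   while (iter.hasNext()) {
      var entry = iter.next();
      // ... do something with the entry
   }
}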

// Print a summary (now): N entries processed
// Print a summary (wish): N entries processed in partition P

Since I'm working with a single store here (which might be partitioned), I assume that every single execution of the punctuator is bound to a single partition of that store.

Is it possible to find out which partition the punctuator operates on? The Javadoc for ProcessorContext.partition() states that this method returns -1 within punctuators.

I've read "Kafka Streams: Punctuate vs Process" and the answers there. I understand that a task is, in general, not tied to a particular partition. But an iterator should be tied to one, IMO.

How can I find out the partition?

Or is my assumption wrong that a particular instance of a store iterator is tied to a partition?

What I need it for: I'd like to include the partition number in some log messages. For now, I have several nearly identical log messages stating that the punctuator does this and that. To make those messages "unique", I'd like to include the partition number in them.

fml2

1 Answer


Posting here the answer that was provided in https://issues.apache.org/jira/browse/KAFKA-12328:

I just used context.taskId(). Its string value ends with the partition number, right after the last underscore (for example, 0_3 for partition 3). This was sufficient for me.
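If the number itself is needed rather than the whole task id, it could be extracted along these lines. This is only a sketch: the class `TaskIdPartitions` and the method `partitionOf` are hypothetical names of mine, not part of Kafka, and the code assumes the task id string ends in `_<partition>` as described above. It works on the plain string so it runs without Kafka on the classpath; in the punctuator the input would be `context().taskId().toString()`.

```java
public final class TaskIdPartitions {

    private TaskIdPartitions() {
    }

    /**
     * Returns the number after the last underscore of a task id string,
     * or -1 if there is no underscore or the suffix is not an integer.
     * Assumes the task id string ends in "_<partition>".
     */
    public static int partitionOf(String taskId) {
        int sep = taskId.lastIndexOf('_');
        if (sep < 0 || sep == taskId.length() - 1) {
            return -1; // no underscore, or nothing after it
        }
        try {
            return Integer.parseInt(taskId.substring(sep + 1));
        } catch (NumberFormatException e) {
            return -1; // suffix is not a number
        }
    }

    public static void main(String[] args) {
        // Stand-in for context().taskId().toString() inside the punctuator.
        String taskId = "0_3";
        System.out.println("N entries processed in partition " + partitionOf(taskId));
    }
}
```

For log messages alone, printing the task id as-is is probably enough, since it is already unique per task. As far as I know, recent Kafka versions also expose the partition directly on the TaskId object, which would make the string parsing unnecessary; check the Javadoc of the version in use.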

fml2