I am running an app locally (Spring Boot, Spring Cloud Stream, Kafka Binder and Kafka Streams Binder) that consumes from and produces to several topics, all of which have 4 partitions. It does some stateful stream processing with Kafka Streams. Everything seems fine after starting several instances of the app, but the moment one instance goes down, the others start throwing FileNotFoundException continuously.

When running 3 instances, just after starting I see:

  • For the first one
    current active tasks: [0_0, 1_0]
    current standby tasks: []
    previous active tasks: [0_0, 1_0, 0_1, 1_1]
  • For the second one
    current active tasks: [0_2, 1_2, 0_3]
    current standby tasks: []
    previous active tasks: [0_2, 1_2, 0_3, 1_3]
  • For the third one
    current active tasks: [0_1, 1_1, 1_3]
    current standby tasks: []
    previous active tasks: []

I publish some messages to the topics processed by Kafka Streams and everything works fine. But when I shut down the first instance, the second one continuously throws:

java.io.FileNotFoundException: /tmp/kafka-streams/my-service/1_2/.checkpoint.tmp (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method) ~[na:1.8.0_144]
    at java.io.FileOutputStream.open(FileOutputStream.java:270) ~[na:1.8.0_144]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213) ~[na:1.8.0_144]
    at java.io.FileOutputStream.<init>(FileOutputStream.java:162) ~[na:1.8.0_144]
    at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:78) ~[kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315) ~[kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:397) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:382) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1042) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767) [kafka-streams-2.0.1.jar:na]
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736) [kafka-streams-2.0.1.jar:na]

The third instance does the same, except that the path in the exception is sometimes:

/tmp/kafka-streams/my-service/1_1/.checkpoint.tmp

and sometimes:

/tmp/kafka-streams/my-service/1_3/.checkpoint.tmp

In /tmp/kafka-streams/my-service I can see that only the directories 0_0, 0_1, 0_2, 0_3 and 1_0 still exist after shutting down the first instance.
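To make the directory check repeatable, here is a minimal sketch that reports which task directories are missing. It assumes the layout implied by the paths in the stack traces, that is `<state dir>/<application id>/<task id>`; the class name `TaskDirCheck` and the method `missingTaskDirs` are made up for illustration and are not part of Kafka Streams.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TaskDirCheck {

    // Returns the task IDs that have no directory under <stateDir>/<applicationId>.
    // The layout is an assumption derived from the exception paths in this post.
    static List<String> missingTaskDirs(Path stateDir, String applicationId, List<String> taskIds) {
        Path appDir = stateDir.resolve(applicationId);
        List<String> missing = new ArrayList<>();
        for (String taskId : taskIds) {
            if (!Files.isDirectory(appDir.resolve(taskId))) {
                missing.add(taskId);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Values taken from this post; adjust them for another setup.
        Path stateDir = Paths.get("/tmp/kafka-streams");
        List<String> allTasks = Arrays.asList(
                "0_0", "0_1", "0_2", "0_3", "1_0", "1_1", "1_2", "1_3");
        System.out.println("missing task directories: "
                + missingTaskDirs(stateDir, "my-service", allTasks));
    }
}
```

Running it before and after stopping an instance shows whether the directories named in the exceptions disappear at that moment.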

This does not crash the app, and from what I can see the state is still accessible from the running instances (though I may be missing something here). Does anyone know why this exception is thrown, what impact it may have, and what I should change to fix it?
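One thing that may be worth ruling out: all instances run on the same machine, so they appear to share the same state directory (/tmp/kafka-streams is the default value of the Kafka Streams `state.dir` setting). The sketch below builds properties with a separate `state.dir` per instance. The class `PerInstanceStateDir`, the method `streamsProps` and the instance name are hypothetical, and how these properties are passed to the Kafka Streams binder depends on the Spring configuration, which is not shown here. This is a sketch for testing the hypothesis, not a confirmed fix.

```java
import java.nio.file.Paths;
import java.util.Properties;

public class PerInstanceStateDir {

    // Builds Kafka Streams properties with a state directory that is unique
    // per local instance. "application.id" and "state.dir" are standard
    // Kafka Streams configuration keys; instanceName is a hypothetical label.
    static Properties streamsProps(String applicationId, String instanceName) {
        Properties props = new Properties();
        props.put("application.id", applicationId);
        props.put("state.dir", Paths.get("/tmp/kafka-streams", instanceName).toString());
        return props;
    }

    public static void main(String[] args) {
        // Each locally started instance would use a different name.
        Properties props = streamsProps("my-service", "instance-1");
        System.out.println(props.getProperty("state.dir"));
    }
}
```

If the exceptions stop when every local instance has its own directory, that would point to the shared state directory as the cause.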

redfox