This application uses Spring Cloud Stream to create a GlobalKTable and is deployed on Kubernetes as a StatefulSet, mounting a volume for the rocksdb content using a persistence volume claim for a Google Cloud Regional Persistent disk:
apiVersion: apps/v1
kind: StatefulSet
metadata:
...
spec:
...
template:
metadata:
...
spec:
containers:
- env:
...
volumeMounts:
- mountPath: /tmp/kafka-streams
name: rocksdb
volumeClaimTemplates:
- apiVersion: v1
kind: PersistentVolumeClaim
metadata:
creationTimestamp: null
name: rocksdb
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: regionalpd-storageclass
volumeMode: Filesystem
status:
phase: Pending
It is working fine, but sometimes after updating the image to a new application version the new container doesn't manage to start, it gets stuck until if finally throws the following exception:
org.springframework.context.ApplicationContextException: Failed to start bean 'streamsBuilderFactoryManager'; nested exception is org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.apache.kafka.streams.errors.StreamsException: Exception caught during initialization of GlobalStreamThread
at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:181)
at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:356)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:155)
at org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:123)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:935)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
at org.springframework.boot.web.reactive.context.ReactiveWebServerApplicationContext.refresh(ReactiveWebServerApplicationContext.java:64)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754)
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:434)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:338)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1332)
at io.xxx.MyApplicationKt.main(SentrioGatewayApplication.kt:27)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:108)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:58)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:88)
Caused by: org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.apache.kafka.streams.errors.StreamsException: Exception caught during initialization of GlobalStreamThread
at org.springframework.cloud.stream.binder.kafka.streams.StreamsBuilderFactoryManager.start(StreamsBuilderFactoryManager.java:96)
at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
... 22 common frames omitted
Caused by: org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.apache.kafka.streams.errors.StreamsException: Exception caught during initialization of GlobalStreamThread
at org.springframework.kafka.config.StreamsBuilderFactoryBean.start(StreamsBuilderFactoryBean.java:333)
at org.springframework.cloud.stream.binder.kafka.streams.StreamsBuilderFactoryManager.start(StreamsBuilderFactoryManager.java:87)
... 23 common frames omitted
Caused by: org.apache.kafka.streams.errors.StreamsException: Exception caught during initialization of GlobalStreamThread
at org.apache.kafka.streams.processor.internals.GlobalStreamThread.initialize(GlobalStreamThread.java:400)
at org.apache.kafka.streams.processor.internals.GlobalStreamThread.run(GlobalStreamThread.java:281)
Caused by: org.apache.kafka.common.errors.TimeoutException: Global task did not make progress to restore state within 300000 ms. Adjust `task.timeout.ms` if needed.
When this happens I have to delete the StatefulSet, the persistent volume claims and create it again (generating a downtime).
Why am I getting this exception? How could I fix it??