We are facing a strange issue where a Quarkus application connected to Azure Event Hubs stops working after a few days.
We have an application that consumes data from one event hub, does some processing, and publishes the processed messages to another event hub. This application has been working fine with Kafka, and due to some internal decisions we are now trying to move it to Event Hubs.
The application is built with Quarkus and SmallRye Reactive Messaging, and is deployed on Kubernetes. After deployment everything works fine for 4-5 days, and then we suddenly start getting readiness check failures and the app stops working.
Example message: Channel down. Event-hub-1 : OK, Event-Hub-2 KO.
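To make that message concrete, here is a minimal sketch of how we understand the readiness result to be aggregated: the overall check is down as soon as any single channel reports KO. This is only our mental model, not SmallRye's actual code; the class `ReadinessSketch`, its methods, and the channel names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReadinessSketch {

    // Assumption: the overall readiness is UP only if every channel is OK.
    static boolean isReady(Map<String, Boolean> channels) {
        return channels.values().stream().allMatch(ok -> ok);
    }

    // Builds a one-line summary: overall status followed by each channel's status.
    static String describe(Map<String, Boolean> channels) {
        StringBuilder sb = new StringBuilder(isReady(channels) ? "UP" : "DOWN");
        channels.forEach((name, ok) ->
                sb.append(" ").append(name).append("=").append(ok ? "OK" : "KO"));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical channel names mirroring the message above.
        Map<String, Boolean> channels = new LinkedHashMap<>();
        channels.put("event-hub-1", true);   // incoming channel still healthy
        channels.put("event-hub-2", false);  // outgoing channel reported as KO
        System.out.println(describe(channels));
    }
}
```

Under this model, a single KO channel is enough for Kubernetes to mark the pod as not ready, which matches what we observe.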
After that, the entire application stops working. Because the failure only appears after a few days rather than immediately, we have not been able to reproduce and debug it easily.
Once the app is restarted, it connects to the event hub again as if nothing was wrong, and continues working for a few more days.
If anyone has any ideas about why this might be happening, it would be very helpful. If more information is required, do let me know.
To reproduce the issue, we tried disabling an event hub, which gives us proper error messages. We also deleted the entire Event Hubs namespace and got error messages in that case too. But when the actual issue happens, we see no errors at all, only the readiness check failure message.