We are using Apache Pulsar 2.11 and seeing scenarios where the round-robin approach is not working correctly: some consumers stay idle even though there is enough backlog.
I have a good understanding of what goes on here in Apache Pulsar, based on the good documentation: https://pulsar.apache.org/docs/2.11.x/developing-binary-protocol/#command-flow
What I am looking for is an easy way to track key metrics for these protocol commands: CommandFlow, CommandAck, CommandAckResponse, CommandMessage.
If I could dig into these metrics, or into some log after the fact, it would help me troubleshoot and better understand why consumers go idle.
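To make the question concrete, here is a rough sketch of the kind of after-the-fact check I have in mind. It assumes the JSON returned by the admin topic stats endpoint (`pulsar-admin topics stats <topic>`) and relies on the `msgBacklog`, `availablePermits`, and `blockedConsumerOnUnackedMsgs` fields. The function name and the sample payload are made up for illustration, and this only approximates what the flow and ack commands would show.

```python
def find_stalled_consumers(topic_stats):
    """Return (subscription, consumer, reason) tuples for consumers that
    cannot receive messages even though their subscription has backlog."""
    stalled = []
    for sub_name, sub in topic_stats.get("subscriptions", {}).items():
        # No backlog means an idle consumer is expected, so skip it.
        if sub.get("msgBacklog", 0) <= 0:
            continue
        for consumer in sub.get("consumers", []):
            name = consumer.get("consumerName", "?")
            if consumer.get("blockedConsumerOnUnackedMsgs", False):
                # Broker stopped dispatching: too many unacked messages.
                stalled.append((sub_name, name, "blocked on unacked messages"))
            elif consumer.get("availablePermits", 0) <= 0:
                # Consumer has not granted new permits via CommandFlow.
                stalled.append((sub_name, name, "no flow permits"))
    return stalled


# Hypothetical payload, shaped like the topic stats JSON (values invented).
sample_stats = {
    "subscriptions": {
        "my-sub": {
            "msgBacklog": 1200,
            "consumers": [
                {"consumerName": "c1", "availablePermits": 500,
                 "unackedMessages": 10, "blockedConsumerOnUnackedMsgs": False},
                {"consumerName": "c2", "availablePermits": 0,
                 "unackedMessages": 0, "blockedConsumerOnUnackedMsgs": False},
                {"consumerName": "c3", "availablePermits": 300,
                 "unackedMessages": 50000, "blockedConsumerOnUnackedMsgs": True},
            ],
        }
    }
}

for subscription, consumer_name, reason in find_stalled_consumers(sample_stats):
    print(f"{subscription}/{consumer_name}: {reason}")
```

This is only a point-in-time snapshot, which is why I am hoping there is something that records the same information over time.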
If anyone has tried tracking these metrics, whether directly, through logs, or through a Grafana integration, I would appreciate any pointers.
Thanks
Pointers for getting more in-depth metrics on Broker/Consumer communication in Apache Pulsar