I am using the azure IOT sdk in an ESP32-based device to connect to the IOT hub using MQTT, sending messages with QOS 1. When the connection is good, all works exactly as intended. However, when we deploy to areas where the connectivity seems somewhat more spotty, the messages often time out (i.e. callback is called with the timeout error). The MQTT still thinks it has a connection (i.e. the disconnect callback has not been called), but all sends end up timing out. Interestingly, I see that when I send c2d messages, they do get picked up. I have configured the firmware to tear down and rebuild the MQTT connection in these scenarios and that sometimes helps but not always.
Two questions:
Why does this seem to happen, and are there parameters that I can twiddle to prevent it. I have reduced the size of the packets but that did not seem to make a difference.
What is the appropriate way of handling this condition? I have seen scenarios where once the communication gets "stuck" like this, it can stay stuck for tens of minutes.
Hope there's someone from MSFT IOT group listening... :)