I am running Mosquitto as a Docker container, version 2.0.14 (image: eclipse-mosquitto:2.0.14). I am intentionally not running 2.0.15, as that version currently has a regression that affects us.
I have created a bridge to AWS, following the standard documentation provided by Amazon.
My config looks like this:
# Bridged topics
topic root/topic/# out 1
# Setting protocol version explicitly
bridge_protocol_version mqttv311
bridge_insecure false
# Bridge connection name and MQTT client Id, enabling the connection automatically when the broker starts.
cleansession false
clientid bridgeawsiot
start_type automatic
notifications false
log_type all
restart_timeout 10 30
I am testing how Mosquitto behaves during network interruptions. We plan to test this in open fields, where we anticipate network issues with potentially prolonged periods of disconnection (several hours up to a couple of days).
We have also enabled message persistence; these are the relevant settings:
max_inflight_bytes 0
max_inflight_messages 0
max_queued_bytes 1073741824
max_queued_messages 100000
persistent_client_expiration 7d
listener 1883
autosave_interval 10
persistence true
persistence_file mosquitto.db
persistence_location /mqtt/data
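As a rough sanity check on those limits, here is a back-of-the-envelope sketch of how long the queue could absorb an outage. It assumes, as I understand the Mosquitto documentation, that queuing stops when the first of max_queued_messages and max_queued_bytes is reached, and that max_queued_bytes counts payload bytes. The payload size is that of one message I found in the store and is assumed to be typical; the rate of 60 messages per minute is purely hypothetical, and the function names are my own.

```python
# Limits copied from the settings above.
MAX_QUEUED_MESSAGES = 100_000
MAX_QUEUED_BYTES = 1_073_741_824  # 1 GiB

# Assumption: payload size of a single observed message, treated as typical.
PAYLOAD_BYTES = 4_706

# Assumption: hypothetical publish rate, replace with the real one.
MSGS_PER_MINUTE = 60


def queue_capacity(max_msgs: int, max_bytes: int, payload_bytes: int) -> int:
    """Messages that fit before the first of the two (non-zero) limits is hit."""
    fit_by_bytes = max_bytes // payload_bytes
    return min(max_msgs, fit_by_bytes)


def hours_until_full(capacity: int, msgs_per_minute: float) -> float:
    """Hours of disconnection the queue can absorb at a steady publish rate."""
    return capacity / msgs_per_minute / 60


if __name__ == "__main__":
    capacity = queue_capacity(MAX_QUEUED_MESSAGES, MAX_QUEUED_BYTES, PAYLOAD_BYTES)
    hours = hours_until_full(capacity, MSGS_PER_MINUTE)
    print(f"capacity: {capacity} messages, about {hours:.1f} hours at "
          f"{MSGS_PER_MINUTE} msg/min")
```

With these numbers the message-count limit is the one that binds, not the byte limit, so anything published after the queue fills would presumably be dropped rather than delivered late.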
On the AWS side of things, we have MongoDB ingesting data as time series. Our telemetry collection is deterministic, so the number of metrics per minute is stable. I am sharing a graph of what the data ingestion looks like:
The queue in Mosquitto seems to just keep growing. It doesn't seem to decrease once connectivity is re-established (I am simulating disconnection by turning off my Wi-Fi). When I watch the $SYS/broker/store/messages/count topic, the numbers mostly increase. When I dump the contents of mosquitto.db (link1, link2), I don't see much detail, but I can observe entries like this:
DB_CHUNK_MSG_STORE:
Length: 4853
Store ID: 59572
Source Port: 1883
Source MID: 7276
Topic: some/topic/here
QoS: 1
Retain: 1
Payload Length: 4706
Expiry Time: 0
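To get an overview instead of reading chunks one by one, something like the following sketch could tally the dumped messages per topic. It assumes the dump is plain text in exactly the "Key: value" layout shown above, with each chunk introduced by a DB_CHUNK_... header line; the function names and the embedded sample are my own, not part of any Mosquitto tool.

```python
from collections import Counter

# Sample in the same layout as the dump excerpt above.
SAMPLE = """\
DB_CHUNK_MSG_STORE:
Length: 4853
Store ID: 59572
Source Port: 1883
Source MID: 7276
Topic: some/topic/here
QoS: 1
Retain: 1
Payload Length: 4706
Expiry Time: 0
"""


def parse_msg_store_chunks(dump_text: str) -> list[dict[str, str]]:
    """Return one dict of fields per DB_CHUNK_MSG_STORE block in the dump."""
    chunks: list[dict[str, str]] = []
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line.startswith("DB_CHUNK_") and line.endswith(":"):
            # A new chunk starts; only message-store chunks are collected.
            current = {} if line == "DB_CHUNK_MSG_STORE:" else None
            if current is not None:
                chunks.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return chunks


def summarise(chunks: list[dict[str, str]]) -> tuple[Counter, int, int]:
    """Return (messages per topic, retained count, total payload bytes)."""
    per_topic = Counter(c.get("Topic", "?") for c in chunks)
    retained = sum(1 for c in chunks if c.get("Retain") == "1")
    payload_bytes = sum(int(c.get("Payload Length", "0")) for c in chunks)
    return per_topic, retained, payload_bytes


if __name__ == "__main__":
    per_topic, retained, payload_bytes = summarise(parse_msg_store_chunks(SAMPLE))
    print(per_topic.most_common(5), retained, payload_bytes)
```

Comparing the per-topic counts before and after reconnecting should show whether the store is dominated by messages waiting for the bridge or by retained messages, which stay in the store regardless of the bridge state.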
I have observed that some data does eventually come through. The graphs suddenly start to fill in, but very slowly; after hours we may get some data points "from the past".
What I am wondering now is: is Mosquitto designed to handle long periods of disconnection? Are we using the right tool for the job here? Maybe it is just a matter of us having configured it incorrectly; if so, can someone point us in a better direction?