
I have an issue where an IoT device loses its connection to a transparent Azure IoT Edge gateway. I don't know where to start looking, so I'm a bit lost here.

IoT Device

I used the sample telemetry application (Python) and customized it to our needs. It connects to the Edge device using MQTT over WebSockets. Initially it works well, until the disconnect happens. The SDK version is 2.11.0.

IoT Edge

I have set up an Azure IoT Edge device as a transparent gateway. It is running the latest version (1.2) and is installed on an Azure Linux VM.

Problem

When the script has been running for some time (e.g. 30 minutes), a connectivity issue appears:

Exception caught in background thread.  Unable to handle.
ReconnectStage: DisconnectEvent received while in unexpected state - CONNECTING, Connected: False
['azure.iot.device.common.pipeline.pipeline_exceptions.OperationTimeout: Transport timeout on connection operation\n']
Traceback (most recent call last):
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\site-packages\azure\iot\device\iothub\aio\async_clients.py", line 33, in handle_result
    return await callback.completion()
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\site-packages\azure\iot\device\common\async_adapter.py", line 91, in completion
    return await self.future
azure.iot.device.common.transport_exceptions.ConnectionDroppedError: transport disconnected

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Development\machine-poc\python\telemetrysender.py", line 165, in <module>
    main()
  File "c:\Development\machine-poc\python\telemetrysender.py", line 76, in main
    send_telemetry_from_device(device_client, payload, i)
  File "c:\Development\machine-poc\python\telemetrysender.py", line 86, in send_telemetry_from_device
    asyncio.run(device_client.send_message(msg))
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
    return future.result()
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\site-packages\azure\iot\device\aio\patch_documentation.py", line 60, in send_message
    return await super(IoTHubDeviceClient, self).send_message(message)
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\site-packages\azure\iot\device\iothub\aio\async_clients.py", line 354, in send_message
    await handle_result(callback)
  File "C:\Users\foo\AppData\Local\Programs\Python\Python310\lib\site-packages\azure\iot\device\iothub\aio\async_clients.py", line 35, in handle_result
    raise exceptions.ConnectionDroppedError("Lost connection to IoTHub") from e
azure.iot.device.exceptions.ConnectionDroppedError: Lost connection to IoTHub
2022-04-06 12:38:25.903174: Closing connection to IoT Hub

The last telemetry message arrived at the IoT Hub at 2022-04-06T12:37:24.510988.

The log of the edgeHub shows the following info:

<6> 2022-04-06 12:37:26.023 +00:00 [INF] - Closing connection for device: python-test-device, ,
<6> 2022-04-06 12:37:26.025 +00:00 [INF] - Disposing MessagingServiceClient for device Id python-test-device because of exception -
<6> 2022-04-06 12:37:26.032 +00:00 [INF] - Setting device proxy inactive for device Id python-test-device
<6> 2022-04-06 12:37:26.034 +00:00 [INF] - Removing device connection for device python-test-device with removeCloudConnection flag 'True'.
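To put the three timestamps above on one timeline, here is a small sketch that computes the gaps between them. It assumes that the script output, the IoT Hub arrival time, and the edgeHub log all refer to the same clock and time zone, which the post does not state explicitly. The helper name `gap_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the post; ASSUMED to share one clock and time zone.
last_telemetry = datetime.fromisoformat("2022-04-06T12:37:24.510988")
edgehub_close = datetime.fromisoformat("2022-04-06T12:37:26.023000")
client_close = datetime.fromisoformat("2022-04-06T12:38:25.903174")


def gap_seconds(earlier, later):
    """Return the elapsed time between two timestamps in seconds."""
    return (later - earlier).total_seconds()


# Time from the last delivered message to edgeHub closing the connection.
print(gap_seconds(last_telemetry, edgehub_close))  # 1.512012
# Time from edgeHub closing the connection to the script reporting the drop.
print(gap_seconds(edgehub_close, client_close))    # 59.880174
```

Under that assumption, edgeHub closes the device connection about 1.5 seconds after the last message reaches the IoT Hub, and the script only reports the lost connection roughly a minute later.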

After such an incident, restarting the script fails with the following error:

ReconnectStage: DisconnectEvent received while in unexpected state - DISCONNECTED, Connected: False
Exception caught in background thread.  Unable to handle.
['azure.iot.device.common.transport_exceptions.ConnectionDroppedError: Unexpected disconnection\n']
Error while connecting: Could not complete operation before timeout

I can only reconnect the device after restarting the edgeHub module on the edge device.
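The traceback shows each message being sent through its own `asyncio.run(...)` call with no retry around it. As a point of comparison, here is a minimal sketch of sending all messages on one event loop with bounded retries and exponential backoff. It is not claimed to fix the edgeHub-side disconnect. `TransientSendError`, `FlakyClient`, and `send_with_retry` are hypothetical names; in the real script the exception would presumably be the SDK's `ConnectionDroppedError` and the client the SDK's device client.

```python
import asyncio


class TransientSendError(Exception):
    """Hypothetical stand-in for the SDK's ConnectionDroppedError."""


async def send_with_retry(send, message, retries=3, base_delay=1.0):
    """Await send(message), retrying with exponential backoff on transient errors."""
    for attempt in range(retries + 1):
        try:
            return await send(message)
        except TransientSendError:
            if attempt == retries:
                raise  # out of attempts: surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)


class FlakyClient:
    """Hypothetical fake client that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.sent = []

    async def send_message(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise TransientSendError("transport disconnected")
        self.sent.append(message)


async def main():
    client = FlakyClient(failures=2)
    # All messages share one event loop instead of one asyncio.run() per message.
    for i in range(3):
        await send_with_retry(client.send_message, f"payload-{i}", base_delay=0.01)
    return client.sent


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The retry count and delays are arbitrary illustration values. If the gateway keeps refusing the connection, as described above, the final attempt still raises and the error reaches the caller.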

Question

Are there any other logs that could help find the root cause? It looks like the error is in the Edge runtime. So why does it crash? And why does the SDK fail to reconnect?
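On the device side, one more source of logs would be the SDK itself. The sketch below assumes the Python SDK logs through the standard `logging` module under the `azure.iot.device` namespace, as the module paths in the traceback suggest. The function name and the log file name are made up for illustration.

```python
import logging


def enable_sdk_debug_logging(logfile="device_client_debug.log"):
    """Route DEBUG-level records from the azure.iot.device loggers to a file."""
    handler = logging.FileHandler(logfile, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    # ASSUMPTION: SDK modules use loggers named after their module path,
    # so they are children of this logger and inherit its level and handler.
    sdk_logger = logging.getLogger("azure.iot.device")
    sdk_logger.setLevel(logging.DEBUG)
    sdk_logger.addHandler(handler)
    return sdk_logger
```

Calling `enable_sdk_debug_logging()` once at the start of the script, before the client is created, should capture the connect and reconnect attempts with timestamps that can be lined up against the edgeHub log.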

Thanks for any help!

Ecstasy
Peb
  • You can refer to [Test and Troubleshoot the gateway connection](https://learn.microsoft.com/en-us/azure/iot-edge/how-to-connect-downstream-device?view=iotedge-2020-11#test-the-gateway-connection), [Unable to Connect Send Message From Downstream Device To Pratent Edge Device](https://github.com/MicrosoftDocs/azure-docs/issues/82009) and [Transparent gateway pattern device provisioning](https://github.com/MicrosoftDocs/azure-docs/issues/17936) – Ecstasy Apr 07 '22 at 05:44
  • Thanks, I read those pages but they don't help. The documents I found so far help with setting up the connection initially. In my case that works, but it stops working after some minutes without giving a hint about what's wrong. – Peb Apr 07 '22 at 09:56
  • @Peb I believe I found a similar issue to yours in the SDK repo. Can you reply to the thread with your [support-bundle logs](https://learn.microsoft.com/en-us/azure/iot-edge/troubleshoot?view=iotedge-2020-11#gather-debug-information-with-support-bundle-command)? It will help troubleshoot: https://github.com/Azure/iotedge/issues/6006 . As a workaround you can try using IoT Edge Version 1.1 – asergaz May 12 '22 at 10:11
  • @asergaz Yes, that's already planned. The GitHub issue I'm working on is https://github.com/Azure/azure-iot-sdk-python/issues/993 – Peb May 16 '22 at 08:03

0 Answers