
I am using the register method of the AutoLockRenew class after receiving a message from an Azure Service Bus topic subscription, using the Azure SDK for Python.

AutoLockRenew.register() is supposed to take care of renewing the lock automatically. However, if renewable.renew_lock() throws an exception because the message lock has expired, it fails silently!

The parent thread, which is still executing business logic, keeps running (it only finds out about the failure when it calls message.complete()). Meanwhile, the same message reappears in the queue and a second instance picks it up for processing. This means the same message is now being processed simultaneously by two different receivers!

What is the recommended way of resolving this?

The code looks like this:

from azure.servicebus import AutoLockRenew

auto_renewer = AutoLockRenew()
with sub_client.get_receiver() as receiver:
    for message in receiver:
        auto_renewer.register(message)
        ....
Deepak Agarwal

1 Answer


(By way of introduction, I'm one of the devs for the Python ServiceBus SDK. I would have left this as a comment, but apparently I don't have enough reputation here.)

A few thoughts:

  1. There is a known bug in version 0.50.2 and earlier: if message processing takes longer than 10 minutes, the service terminates the associated link and the message can no longer be settled (and would in fact be picked up by another receiver, as you describe). Simply updating to 0.50.3 (released earlier this week) should address this, since you appear to be using autorenew properly, and it has been bolstered to keep the associated link alive.
  2. The autorenew failure is somewhat quiet because it does its work in a separate thread, but it is reported via the inner_exception property of the exception thrown when you attempt to complete(). If you want to observe it sooner, you can check the auto_renew_error field on the message. I would be curious to know what this value is set to, to better triangulate whether what you're seeing is due to point #1 above or something else.
  3. If the above isn't helpful, it can be very useful in this sort of scenario to see the deeper logging output. To enable it, first set up loggers for the "azure" and "uamqp" names (to view logs from the two Python components in the stack), then set debug=True when creating your ServiceBusClient (to view logs from the underlying C messaging lib). This assumes you're using version 0.50.3 or earlier; the name of that property changed in version 7.0.0b1.
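As a rough illustration of point 3, the logger setup could look something like the sketch below. It uses only the standard logging module; the helper name enable_sdk_debug_logging and the log format are made up for this example, and the client call in the trailing comment is an assumption about how debug=True would be passed in 0.50.x.

```python
import logging
import sys


def enable_sdk_debug_logging(stream=sys.stdout):
    """Attach a DEBUG-level stream handler to the "azure" and "uamqp" loggers.

    Hypothetical helper for illustration; it is not part of the SDK.
    """
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    configured = []
    for name in ("azure", "uamqp"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
        configured.append(logger)
    return configured


loggers = enable_sdk_debug_logging()

# With azure-servicebus 0.50.x installed, the client would then be created
# with debug=True. Assumed call shape, with conn_str as a placeholder:
# client = ServiceBusClient.from_connection_string(conn_str, debug=True)
```

Child loggers such as "azure.servicebus" inherit the DEBUG level from "azure", so one handler per top-level name should be enough.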

A note for the future: if you want to reach out to us, feel free to open an issue at github.com/Azure/azure-sdk-for-python, as those show up front and center for us.

Kibrantn
  • Thanks a lot for your response. However, my question is still somewhat unanswered. Why is the silent failure by design? Doesn't it defeat the purpose of locking a message, which is that no other subscriber should work on the same message? I can check the inner_exception or auto_renew_error field in my main thread, but how frequently? It completely takes away the benefit of using AutoLockRenew in the first place, doesn't it? As suggested, I have raised it here - https://github.com/Azure/azure-sdk-for-python/issues/11611 – Deepak Agarwal May 23 '20 at 12:23
  • You're correct as to the purpose of locking. The subtlety here is that the lock _has_ been lost, but the error is only raised in the renewer thread. That is by design, and somewhat constrained by the language/call pattern; we're limited in how we can expose an exception from a parallel thread. Unfortunately autorenew can never be truly bulletproof (e.g. network cut), so there's always some need to handle a failed settlement, or to preemptively exit on auto_renew_error. Primarily, it exists to save users from writing their own keep-alive loop. I've echoed my response (with verbosity) on GitHub for convenience. – Kibrantn May 26 '20 at 21:27
  • Thanks @Kibrantn. I've moved to the github now and provided my comments. – Deepak Agarwal May 27 '20 at 04:37
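
One possible way to "preemptively exit on auto_renew_error", as discussed in the comments above, is to split the business logic into steps and check the field between them. The sketch below is only an illustration under that assumption: LockLostError and process_with_lock_checks are hypothetical names, not SDK APIs, and the only SDK members it relies on are the auto_renew_error field and complete() mentioned in the answer.

```python
class LockLostError(Exception):
    """Hypothetical exception for this sketch; not part of the SDK."""


def _raise_if_renewal_failed(message):
    # auto_renew_error is the field named in the answer; it is assumed to
    # stay None while the background renewer is healthy.
    error = getattr(message, "auto_renew_error", None)
    if error is not None:
        raise LockLostError("lock renewal failed: {!r}".format(error))


def process_with_lock_checks(message, steps):
    """Run each step in order, stopping as soon as a renewal error is seen.

    `steps` is an iterable of callables taking the message; splitting the
    business logic this way is a design choice of this sketch.
    """
    for step in steps:
        _raise_if_renewal_failed(message)
        step(message)
    # Check once more right before settling the message.
    _raise_if_renewal_failed(message)
    message.complete()
```

How often to check depends on how the work is split: shorter steps notice a lost lock sooner. Since the lock can still be lost between the last check and complete(), the caller would still need to handle an exception from settlement.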