I have custom workers running as Cloud Services (classic) on Azure. I have scheduled autoscaling of the instance count: 5 during business hours and 20 otherwise.

Sometimes workers are still processing commands when business hours begin, so their instances are "killed" by the scale-down. As a result, the commands aren't finished. Commands are messages from Azure queues, so those messages are eventually returned to the queue, but it takes some time before another worker picks them up again.

So the question is: is there an automatic way to scale down workers only after they have finished processing their current message/command?

demo
  • Generally speaking it is not recommended but if I recall correctly you can prevent a role from shutting down by returning `false` from `OnStop` method. You could give that a try and return false if the message/command is being processed. – Gaurav Mantri Mar 05 '20 at 14:11
  • @GauravMantri, seems `OnStop` returns `void`. `OnStart` can return `bool` value – demo Mar 05 '20 at 14:35

2 Answers

See the answers at How to Stop single Instance/VM of WebRole/WorkerRole.

In short, you can use the Delete Role Instances API (https://learn.microsoft.com/en-us/previous-versions/azure/reference/dn469418(v=azure.100)) to specify which role instances you want to shut down. However, this doesn't work with autoscale, and you would have to write some code to determine which instances you want to have shut down.

Generally the better solution is to ensure the work that the worker role instances are doing is idempotent and can be easily resumed. It sounds like you already have that design but there are issues with the amount of time it takes, in which case you may have opportunities to break each 'command' into smaller units of work so that progress can be maintained if the processing of that command is picked up later by a new instance.
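As a rough illustration of breaking a command into smaller, resumable units, here is a minimal Python sketch (language chosen only for brevity; the same idea applies in a .NET worker role). Everything in it is hypothetical: `run_command`, the `checkpoints` dict (standing in for durable storage such as a table or blob), and the `should_stop` callback are not part of any Azure API.

```python
def run_command(command_id, steps, checkpoints, should_stop=lambda: False):
    """Run `steps` in order, resuming after the last checkpointed step.

    `checkpoints` is a plain dict here; in a real worker it would be
    durable storage shared by all instances (an assumption of this sketch).
    Returns True when every step has run, False if interrupted.
    """
    start = checkpoints.get(command_id, 0)  # 0 means "never started"
    for index in range(start, len(steps)):
        if should_stop():
            # Instance is going away: leave the checkpoint as-is so the
            # next instance resumes from this step.
            return False
        steps[index]()
        # Record progress after each unit of work.
        checkpoints[command_id] = index + 1
    return True
```

If an instance dies between running a step and recording the checkpoint, that one step runs again on resume, which is why each step still needs to be idempotent.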

kwill

Your role's OnStop will be called when it's getting torn down. Per the docs:

Once the OnStop method has finished executing, the role will be stopped. If other code requires time to exit gracefully you should keep the OnStop thread busy until execution is complete.

You can take advantage of this 5 minute grace period to drain active work from the role instance. For example, if you share a cancellation token across your tasks, you can cancel in-flight work during `OnStop` and handle the cancellation by promptly making the work visible on the queue again instead of waiting for the visibility timeout.
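The drain-then-cancel idea can be sketched roughly as follows. This is a self-contained Python simulation, not worker role code: the `Worker` class, its `on_stop` method, the in-memory `queue.Queue`, and the two `threading.Event` flags (standing in for a shared cancellation token) are all assumptions made for illustration. In a real role the same logic would sit in `OnStop`, and putting the command back would be replaced by whatever the queue client offers for releasing a message early.

```python
import queue
import threading
import time


class Worker:
    """Toy worker that drains in-flight work when asked to stop."""

    def __init__(self, commands):
        self.commands = commands                 # stand-in for the Azure queue
        self.stop_accepting = threading.Event()  # "do not fetch new messages"
        self.cancel = threading.Event()          # "abandon the current command"
        self.completed = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        while not self.stop_accepting.is_set():
            try:
                command = self.commands.get(timeout=0.01)
            except queue.Empty:
                continue
            if self._process(command):
                self.completed.append(command["id"])
            else:
                # Cancelled mid-command: hand it back immediately rather
                # than letting it reappear only after a timeout.
                self.commands.put(command)

    def _process(self, command):
        for _ in range(command["steps"]):
            if self.cancel.is_set():
                return False
            time.sleep(0.01)  # simulated unit of work
        return True

    def on_stop(self, drain_seconds):
        """Keep the caller blocked until work is finished or released."""
        self.stop_accepting.set()
        self._thread.join(timeout=drain_seconds)  # let current work finish
        if self._thread.is_alive():
            self.cancel.set()                     # out of time: give it back
            self._thread.join()
```

Blocking inside `on_stop` mirrors the "keep the OnStop thread busy" advice quoted above: the instance first stops fetching new messages, gives the current command a chance to finish, and only cancels and re-queues it if the drain window runs out.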

Greg D