I am spawning several Azure Service Fabric (ASF) microservices to run a process. Once the process is done, I delete those services with DeleteServiceAsync, using the code below. About 98% of the time everything works fine. The other 2% of the time the call times out and the microservice gets stuck in the Deleting state with an Idle Secondary replica. Thanks in advance for any suggestions to resolve this issue.

using (FabricClient fc = new FabricClient())
{
    // Await the call so the FabricClient is not disposed while the delete is still in flight.
    await fc.ServiceManager.DeleteServiceAsync(deleteServiceDescription, TimeSpan.FromMinutes(5), cancellationToken);
}
antar
    Do you have a code path in RunAsync that could take a long time and doesn't regularly check the CancellationToken? – LoekD Mar 13 '17 at 08:22
    The CancellationToken passed to RunAsync is not used. RunAsync starts another task that could take a long time. Do I need to pass this cancellation token to the new task that I start within RunAsync, so that all tasks are terminated when cancellation is requested? – antar Mar 13 '17 at 16:17
    Yes, pass it to the operation and check it. – LoekD Mar 13 '17 at 20:40
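
To illustrate the advice in the comments, here is a minimal sketch of a RunAsync-style method that forwards its token to the task it starts and checks it inside the loop. It assumes nothing about the asker's actual service: Worker and ProcessAsync are hypothetical names, the loop body is a placeholder, and the method is a plain static stand-in for the service's RunAsync override so that it compiles without the Service Fabric SDK.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Worker
{
    // Stand-in for the body of RunAsync(CancellationToken) in a service.
    public static async Task RunAsync(CancellationToken cancellationToken)
    {
        // Hand the same token to the long-running task instead of dropping it.
        await Task.Run(() => ProcessAsync(cancellationToken), cancellationToken);
    }

    // Hypothetical long-running operation; the real work goes where the delay is.
    private static async Task ProcessAsync(CancellationToken cancellationToken)
    {
        while (true)
        {
            // Stop promptly once cancellation has been requested.
            cancellationToken.ThrowIfCancellationRequested();

            // Cancellable wait; throws when the token is signalled.
            await Task.Delay(TimeSpan.FromMilliseconds(100), cancellationToken);
        }
    }
}
```

With this shape, the task returned by RunAsync ends in the canceled state shortly after the token is signalled, rather than continuing to run while the replica is being shut down.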

1 Answer

Well, you could force drop the offending replicas with something like the solution provided to this question, but that's usually a bad idea and shouldn't be done in production.

This stuck state usually indicates that the service is having a problem in its shutdown path. Have you debugged this locally? Just creating and deleting the service in a loop until it happens should be enough to show you where it is.
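
The create-and-delete loop suggested above could be sketched roughly as follows. This is only an outline under stated assumptions: DeleteRepro and FindStuckDeleteAsync are hypothetical names, and the two delegates stand in for whatever calls create and delete the service (for example, wrappers around fc.ServiceManager.CreateServiceAsync and fc.ServiceManager.DeleteServiceAsync), so the helper itself has no Service Fabric dependency.

```csharp
using System;
using System.Threading.Tasks;

public static class DeleteRepro
{
    // Runs create/delete cycles and returns the index of the first cycle whose
    // delete did not finish within the timeout, or -1 if every delete completed.
    public static async Task<int> FindStuckDeleteAsync(
        Func<Task> createService,
        Func<Task> deleteService,
        int maxIterations,
        TimeSpan deleteTimeout)
    {
        for (int i = 0; i < maxIterations; i++)
        {
            await createService();

            Task delete = deleteService();
            Task finished = await Task.WhenAny(delete, Task.Delay(deleteTimeout));
            if (finished != delete)
            {
                return i; // delete is still pending: attach the debugger here
            }

            try
            {
                await delete; // surface any exception from the delete call
            }
            catch (TimeoutException)
            {
                return i; // the delete call itself reported a timeout
            }
        }
        return -1;
    }
}
```

Once the loop stops on a stuck iteration, breaking into the service process should show which part of the shutdown path is still running.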

Community
masnider