Run a .NET Core service on multiple machines for high availability, with only one node doing the work

Question

I have a .NET Core application that consists of some background tasks (hosted services) and Web APIs (which control those background tasks and report their status). Other applications (e.g. clients) communicate with this service through these Web API endpoints. We want this service to be highly available, i.e. if one instance crashes, another instance should start doing the work automatically. The client applications should also switch to the next instance automatically (clients should call the APIs of the new instance instead of the old one).

The other important requirement is that the task (computation) this service performs in the background can't be shared between two instances. We have to make sure only one instance does this task at any given time.

What I have done so far: I run two instances of the same service and use a SQL Server-based distributed locking mechanism (SqlDistributedLock) to acquire a lock. The instance that acquires the lock does the work while the other instance waits to acquire it. If one instance crashes, the other can acquire the lock. On the client side, I use a Polly-based retry mechanism that switches the calling URL to the next node until it finds the working one.
But this design has an issue: if the node holding the lock loses connectivity to the SQL Server, the second instance acquires the lock and starts doing the work while the first instance is still in the middle of the same work.
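
To make the failure concrete, here is a minimal sketch of the scenario, not the real setup. It assumes the lock behaves like a lease that expires unless renewed, and it replaces SQL Server with a hypothetical in-memory `LeaseStore`. `Worker`, `Heartbeat`, `MayWork`, and the explicit `now` values are also made up for illustration. The point it shows is that a node which cannot renew has to stop on its own once its locally tracked lease runs out, because it cannot ask the store whether it still holds the lock.

```csharp
using System;

// Demo: node A holds the lease, loses connectivity at t=5, node B takes over at t=10.
var store = new LeaseStore(ttl: 10);
var a = new Worker("A", store, ttl: 10);
var b = new Worker("B", store, ttl: 10);

foreach (var now in new long[] { 0, 5, 10 })
{
    if (now == 5) a.Connected = false;   // simulated loss of connectivity to the store
    a.Heartbeat(now);
    b.Heartbeat(now);
    Console.WriteLine($"t={now}: A may work={a.MayWork(now)}, B may work={b.MayWork(now)}");
}

// Hypothetical in-memory stand-in for the SQL Server lock: a lease that expires
// unless its holder keeps renewing it. Time is passed in so the demo is deterministic.
public sealed class LeaseStore
{
    private readonly object _gate = new object();
    private readonly long _ttl;
    private string _holder = "";   // empty string means nobody holds the lease
    private long _expiresAt;

    public LeaseStore(long ttl) { _ttl = ttl; }

    // Grants the lease if it is free, expired, or already held by this node.
    public bool TryAcquireOrRenew(string node, long now)
    {
        lock (_gate)
        {
            if (_holder.Length == 0 || now >= _expiresAt || _holder == node)
            {
                _holder = node;
                _expiresAt = now + _ttl;
                return true;
            }
            return false;
        }
    }
}

// Hypothetical worker that keeps its own view of how long its lease is valid.
public sealed class Worker
{
    private readonly string _name;
    private readonly LeaseStore _store;
    private readonly long _ttl;
    private long _leaseValidUntil = -1;   // local view, usable while the store is unreachable

    public bool Connected { get; set; } = true;   // simulated network state

    public Worker(string name, LeaseStore store, long ttl)
    {
        _name = name;
        _store = store;
        _ttl = ttl;
    }

    // Tries to acquire or renew; a disconnected node simply cannot renew.
    public void Heartbeat(long now)
    {
        if (Connected && _store.TryAcquireOrRenew(_name, now))
            _leaseValidUntil = now + _ttl;
    }

    // The background task should check this before every unit of work.
    public bool MayWork(long now) => now < _leaseValidUntil;
}
```

In this run A may work at t=0 and t=5, and at t=10 only B may work, so the two nodes never overlap. That only holds because A checks `MayWork` before each step. Real clocks drift, so a real implementation would presumably stop some safety margin before the lease expires.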

I think I need some sort of leader election (it seems I have implemented it wrongly). Can anyone help me with a better solution for this kind of problem?

score 0 · Answer 1 · answered Sep 26 '22 at 09:34

This problem is not specific to .NET or any other framework, so please make your question more general to make it more accessible. Generally, the solution to this problem lies in the domain of Enterprise Integration Patterns; consult the references on that topic, as the state of the art may change.

At first glance, and based on my own experience developing distributed systems, I suggest two solutions:

use a load balancer or gateway to distribute requests between your service instances.
use a shared message queue broker to put requests in and let each service instance dequeue a request for processing.
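
As a rough illustration of the second option, the sketch below uses an in-process `System.Threading.Channels` channel as a stand-in for a real message broker, with two tasks playing the role of service instances. The instance names and the integer "requests" are assumptions for the demo. It only shows the basic property of a shared queue: each request is dequeued by exactly one instance.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

// In-memory channel standing in for the shared queue broker (assumption for the demo).
var queue = Channel.CreateUnbounded<int>();
var handled = new ConcurrentBag<(string Instance, int Request)>();

// Each "service instance" pulls requests until the queue is completed.
async Task RunInstance(string name)
{
    await foreach (var request in queue.Reader.ReadAllAsync())
    {
        handled.Add((name, request));   // real processing would happen here
    }
}

var instances = new[] { RunInstance("instance-1"), RunInstance("instance-2") };

// Producer side: clients put their requests on the queue.
for (var i = 1; i <= 6; i++)
{
    await queue.Writer.WriteAsync(i);
}
queue.Writer.Complete();
await Task.WhenAll(instances);

// Every request was handled once; which instance handled it is not deterministic.
Console.WriteLine($"handled: {handled.Count}");
Console.WriteLine($"distinct requests: {handled.Select(h => h.Request).Distinct().Count()}");
```

Both lines report 6. Note that this gives one consumer per request. It does not by itself ensure that a long-running background computation is active on only one node.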

Either is fine, and I use both in my own designs.
