There are two important properties worth remembering when it comes to cluster state:
- There is no such thing as a global cluster state. Each node has its own view of what the cluster looks like at the moment. This is a natural consequence of the peer-to-peer approach: there is no single master node that can set an arbitrary state.
- Cluster state is not updated immediately. Node states are built to converge eventually, but some time may pass before a node's current status has been gossiped to the others.
The simplest way to get the cluster state as the current node sees it is `Cluster.Get(Context.System).State`,
which contains info about the currently known members as well as the unreachable nodes.
Another way is to call `cluster.Subscribe(Self, typeof(ClusterEvent.IMemberEvent), typeof(ClusterEvent.IReachabilityEvent))`
from within an actor (the subscription has to be removed when the actor dies). This way you get notified about cluster state changes as they happen.
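As a rough sketch of both approaches together, an actor could read the local snapshot on start and then subscribe to changes. The class name `ClusterListener` and the log messages are made up for illustration; this assumes the Akka.Cluster package is referenced and the actor system is configured with the cluster provider.

```csharp
using Akka.Actor;
using Akka.Cluster;
using Akka.Event;

// Hypothetical listener actor, for illustration only.
public class ClusterListener : ReceiveActor
{
    // Fully qualified to avoid a clash between the Akka.Cluster namespace and the Cluster class.
    private readonly Akka.Cluster.Cluster _cluster = Akka.Cluster.Cluster.Get(Context.System);
    private readonly ILoggingAdapter _log = Context.GetLogger();

    public ClusterListener()
    {
        // With the default subscription mode, the first message is a snapshot of this node's view.
        Receive<ClusterEvent.CurrentClusterState>(state =>
            _log.Info("Snapshot: {0} members, {1} unreachable",
                state.Members.Count, state.Unreachable.Count));

        // Membership changes (joining, up, leaving, removed, ...).
        Receive<ClusterEvent.IMemberEvent>(e =>
            _log.Info("Member {0} is now {1}", e.Member.Address, e.Member.Status));

        // Reachability changes reported by the failure detector.
        Receive<ClusterEvent.UnreachableMember>(e =>
            _log.Info("Member {0} became unreachable", e.Member.Address));
        Receive<ClusterEvent.ReachableMember>(e =>
            _log.Info("Member {0} became reachable again", e.Member.Address));
    }

    protected override void PreStart()
    {
        // One-off read of the state as this node currently sees it.
        var state = _cluster.State;
        _log.Info("Known members at start: {0}", state.Members.Count);

        _cluster.Subscribe(Self,
            typeof(ClusterEvent.IMemberEvent),
            typeof(ClusterEvent.IReachabilityEvent));
    }

    // Remove the subscription when the actor stops.
    protected override void PostStop() => _cluster.Unsubscribe(Self);
}
```

Keep in mind that both the snapshot and the events reflect only this node's view, so different nodes may log different states for a while.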
Regarding the long time required to acknowledge a removed node: depending on what you mean by "long", this may be a sign of a bug. In that case, it would be great if you could file an issue with repro steps.