There are 2 things to point here. One is that if you have a serious risk of your lighthouse node going down, you probably should have more that one -
akka.cluster.seed-nodes
setting can take multiple addresses, the only requirement here is that all nodes, including lighthouses, must have them specified in the same order. This way if one lighthouse is going down, another one still can take its role.
Other thing is that when a node becomes unreachable (either because the process crashed on network connection is unavailable), by default akka.net cluster won't down that node. You need to tell it, how it should behave, when such thing happens:
- At any point you can configure your own
IDowningProvider
interface, that will be triggered after certain period of node inactivity will be reached. Then you can manually decide what to do. To use it add fully qualified type name to followin setting: akka.cluster.downing-provider = "MyNamespace.MyDowningProvider, MyAssembly"
. Example downing provider implementation can be seen here.
- You can specify
akka.cluster.auto-down-unreachable-after = 10s
(or other time value) to specify some timeout given for an unreachable node to join - if it won't join before the timeout triggers, it will be kicked out from the cluster. Only risk here is when cluster split brain happens: under certain situations a network failure between machines can split your cluster in two, if that happens with auto-down set up, two halves of the cluster may consider each other dead. In this case you could end up having two separate clusters instead of one.
- Starting from the next release (Akka.Cluster 1.3.3) a new Split Brain Resolver feature will be available. It will allow you to configure more advanced strategies on how to behave in case of network partitions and machine crashes.