Activation of actors fails on premise cluster

Question

We have some long-running jobs implemented as Service Fabric actors. The actors have no data other than the reminder. When these services are deployed to a local cluster, they activate with no issues. When we deploy them to a server that runs a 3-node cluster, some of the services fail to activate. Memory utilization on the nodes does not go beyond 50%. However, when we added 2 more nodes and ran on a 5-node cluster, activation worked fine. We are using only 1 partition and a replica count of 1, so we are wondering whether some setting is stopping Service Fabric from activating more services. We have also increased the application port range, but no luck.

We also noticed that after one service activation fails, other stateful services become unstable as well and report unhealthy partitions. The cluster also runs some stateless services, which run without any problems. Any clue why activation fails for the actors?

Is the "3 nodes server" an on-premise installation of an SF cluster? Can you also share the configuration from `ServiceManifest.xml`? — Oleg Karasik, May 06 '19 at 11:13
Added here. Other than the IPs, the values are unchanged. https://1drv.ms/f/s!Ap5h1OBj38ZohHQG4Sibjs5m5bkG — bomaboom, May 07 '19 at 04:49
Here are the files. https://1drv.ms/f/s!Ap5h1OBj38ZohHBDe1cvqguDiOcI Our deployment scripts use the Local.1Node file to deploy. We don't need the 10 partitions or 3 replicas because the actors hold no data, so we use the reduced values in Local.1Node. Let me know if you find anything wrong. — bomaboom, May 09 '19 at 06:25

0 Answers