I have 2 PCs, and I run the following commands in a gfsh terminal on each of them:
start locator --name=locator1 --locators=ipaddress1[10334],ipaddress2[10334]
start server --name=server1 --locators=ipaddress1[10334],ipaddress2[10334]
After they start, I can see all 4 members on both terminals when I run list members.
NOW:
Say I run these commands on PC1 first and then on PC2 (so PC1 is the first one online).
If I shut down PC2 to simulate a PC failure, PC1 is OK. When I run list members, it shows 2 members (locator and server).
I bring PC2 back up and run the commands again, and everything is good with 4 members again.
HOWEVER, if I shut down PC1 (the first PC in the original cluster startup), PC2 drops its connection to everything shortly after (about 5 seconds). The gfsh connection is dropped and I am unable to connect to localhost at all, but the server and locator processes are still running.
The logs on PC2 say: Membership Service Failure: Exiting due to possible network partition event due to loss of 2 cache processes.
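One possible lead that I have not confirmed: as far as I understand the Geode network partitioning documentation, each member carries a weight (by default 3 for a locator, 10 for a server, 15 for the "lead member", which is normally the longest-running server), and surviving members shut themselves down when roughly half or more of the total weight disappears at once. The sketch below is only arithmetic under those assumed defaults. The names in it are my own and are not a Geode API.

```python
# Assumed default membership weights (from my reading of the Geode docs).
LOCATOR_WEIGHT = 3
SERVER_WEIGHT = 10
LEAD_MEMBER_BONUS = 5  # assumption: the lead (oldest) server weighs 15 instead of 10


def host_weight(has_lead_server: bool) -> int:
    """Weight of one PC that runs one locator and one server."""
    weight = LOCATOR_WEIGHT + SERVER_WEIGHT
    if has_lead_server:
        weight += LEAD_MEMBER_BONUS
    return weight


def lost_fraction(lost: int, total: int) -> float:
    """Share of the cluster's total weight that disappears with a host."""
    return lost / total


# PC1 started first, so its server is assumed to be the lead member.
pc1 = host_weight(has_lead_server=True)
pc2 = host_weight(has_lead_server=False)
total = pc1 + pc2

print(f"PC1 down: {pc1}/{total} = {lost_fraction(pc1, total):.0%} of weight lost")
print(f"PC2 down: {pc2}/{total} = {lost_fraction(pc2, total):.0%} of weight lost")
```

If those defaults are right, losing PC1 removes more than half of the weight while losing PC2 removes less than half, which would match the asymmetry I am seeing.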
When I bring PC1 back online and run the locator and server commands, I can connect again on PC2.
Can anyone help me with this? I am having a really hard time figuring out what is happening here.