I've been working on this for a while and have managed to crawl through the configuration, even though the documentation is close to nonexistent.

Right now the problem is that my ReplicaSet Secondaries cannot get a heartbeat to my Primary. I am able to ping all hosts from each other and I am able to connect to the shell from all hosts.

The ReplicaSet initiated and I was able to add the members, so I know they can all communicate.

Is there something I need to open up on the firewall to get the heartbeats through?

Roman Pokrovskij
A. Smith
  • Can you open the shell from one host and connect to the server on another host? If not, you would need to allow access to port 27017 on the firewall. – kevinadi Jan 17 '19 at 00:31
  • Also, what made you determine that the secondaries cannot get a heartbeat to the primary? If you type `rs.status()` and it shows `PRIMARY` and `SECONDARY`, it means that the replica set is working and can communicate with each other. They won't be `SECONDARY` otherwise. – kevinadi Jan 17 '19 at 00:32
  • As I said in my post, I am able to connect to the secondary shells from the primary. I say secondary only hypothetically because they haven't actually initiated yet but they're not the primary. I know the secondaries are missing heartbeat because it tells me in the log. I know they're not initiated because when I look in rs.status() they're still in STARTUP (0). – A. Smith Jan 17 '19 at 14:02
  • I've narrowed it down to a Public Firewall issue. For some reason it is dropping the TCP request from the Secondary even though I have the Firewall rules in place to allow traffic on 27017. – A. Smith Jan 17 '19 at 17:42

1 Answer

The problem was with the inbound Firewall Rule I created for traffic over 27017.

My inbound rule had a typo in the port number, preventing either secondary from contacting the primary.

The outbound rule was fine, which made it look like the ReplicaSet was working, because the secondaries still received information from the Primary.
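Ping only proves ICMP gets through, so it will not catch a rule like this. A TCP connect test against the mongod port, run from each member toward every other member, would show the one-directional block. Below is a minimal sketch, assuming Python is available on the hosts; the function names `can_connect` and `check_peers` are my own, and 27017 is assumed to be the port your members listen on.

```python
import socket


def can_connect(host, port=27017, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake, which is
        # exactly what a dropped inbound rule would prevent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_peers(peers, port=27017, timeout=3.0):
    """Map each peer host name to whether its port is reachable from this host."""
    return {host: can_connect(host, port, timeout) for host in peers}
```

Calling `check_peers([...])` with the host names from `rs.conf()` on the Primary and on each Secondary should give `True` everywhere. A `False` in only one direction points at an inbound rule on the receiving side.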

This will create a problem if you're in this scenario and you shut down the secondaries: the Primary will notice that they went offline and, no longer able to see a majority of the set, will step down to Secondary. It will stay that way and be unable to recover until you figure out the underlying issue.
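Before taking any member down in a setup like this, it may be worth confirming that every member has actually reached `PRIMARY` or `SECONDARY` rather than sitting in `STARTUP`. The sketch below assumes you already have the `rs.status()` document as a Python dict (for example from pymongo's `client.admin.command("replSetGetStatus")`); the helper name `stuck_members` is hypothetical, and treating `ARBITER` as a settled state is my own assumption.

```python
def stuck_members(status):
    """List members that are unhealthy or have not reached a settled state.

    `status` is the document returned by replSetGetStatus / rs.status().
    Returns a list of (name, stateStr, lastHeartbeatMessage) tuples.
    """
    settled = {"PRIMARY", "SECONDARY", "ARBITER"}
    problems = []
    for member in status.get("members", []):
        state = member.get("stateStr", "UNKNOWN")
        # health is 1 when the member is reachable from the reporting node.
        if state not in settled or member.get("health", 1) != 1:
            problems.append(
                (member.get("name"), state, member.get("lastHeartbeatMessage", ""))
            )
    return problems
```

An empty list means every member the reporting node knows about is settled and healthy. Anything else is worth resolving before you restart or shut down a member.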

A. Smith