We are building a two-node SQL Server availability group with SQL Server 2016 SP3.
Steps taken:
1.) Build two VMs in Azure in the same region, but different Zones
2.) Install Windows Failover Cluster on both nodes
3.) Install SQL Server 2016 SP3 on each node
4.) Create a failover cluster with each node and a cloud witness
5.) Enable the failover cluster on the SQL engine service
6.) Create an availability group and add both nodes and a database
7.) Add a listener to the availability group
At this point, we can connect to the listener name with SSMS from the primary node. The DNS entry has been created and points to the IP address assigned to the listener.
If I go to node 2 and try to connect to the listener name, I get a connection timeout. An nslookup of the listener name returns the correct IP address.
When I fail over from node 1 to node 2, the connection to the listener stops working on node 1 and starts working on node 2.
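The per-node check above (name resolves, but the connection times out) can be reproduced outside SSMS with a small script. This is only a sketch: the listener name `aglistener.example.local` is a placeholder, port 1433 is assumed to be the listener port, and `check_listener` is a helper name made up for this post. It resolves the name, then attempts a plain TCP connect to each returned address, so it shows whether the timeout happens at the network level before SQL Server is involved.

```python
import socket


def check_listener(host, port=1433, timeout=5.0):
    """Resolve host, then try a TCP connect to each distinct address.

    Returns a list of (ip, connected, detail) tuples.
    """
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed; nothing to connect to.
        return [(None, False, f"DNS lookup failed: {exc}")]

    results = []
    seen = set()
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        if ip in seen:
            continue
        seen.add(ip)
        try:
            # Plain TCP handshake only; no SQL login is attempted.
            with socket.create_connection((ip, port), timeout=timeout):
                results.append((ip, True, "connected"))
        except OSError as exc:
            # Covers timeouts, refused connections, and unreachable hosts.
            results.append((ip, False, str(exc)))
    return results


if __name__ == "__main__":
    # Placeholder listener name and assumed port; replace with real values.
    for ip, ok, detail in check_listener("aglistener.example.local", 1433):
        print(ip, "OK" if ok else "FAILED", detail)
```

Running it on node 1 and node 2, before and after a failover, gives a side-by-side view of which node can reach the listener IP at the TCP level.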
We have moved node 2 to a separate subnet and still see the same behavior.
I know there are some intricacies with Azure VMs and failover cluster communication, but we have already tried the fixes we found for this.
The only thing we have been hesitant to do is deploy a standard load balancer.
Can anyone suggest what we should look at next?