I am trying to get Heartbeat to work on EC2. I followed the steps in this blog post.

SERVER 1
autojoin none
ucast eth0 175.41.181.175
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node ip-10-130-83-33
node ip-10-130-71-107

SERVER 2
autojoin none
ucast eth0 175.41.182.186
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node ip-10-130-83-33
node ip-10-130-71-107

When I start the services on both machines, I get the output below. I verified that the host names are correct using uname -n. Both machines have the same auth key. I tried both the private and public IP addresses. Both EC2 machines are in the same zone.

Attempting connection to the cluster........

============
Last updated: Tue Mar 13 18:54:13 2012
Stack: Heartbeat
Current DC: NONE
1 Nodes configured, unknown expected votes
0 Resources configured.
============

Node ip-10-130-71-107 (97639952-508d-4c6b-88bf-a6a92166d41a): UNCLEAN (offline)

============
Last updated: Tue Mar 13 18:52:11 2012
Stack: Heartbeat
Current DC: ip-10-130-83-33 (a16dbec6-2c49-47e1-bbaa-d90bdc39625a) - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
1 Nodes configured, unknown expected votes
0 Resources configured.
============

Online: [ ip-10-130-83-33 ]
mgorven
Tampa
  • It would seem that your two nodes can't communicate with each other. Check that the ports are open: nmap -p 694 -sU -P0 10.xxx.xxx.xxx (and check your security groups). Confirm that the 'heartbeat' communications are occurring: tcpdump port 694. Check your log (especially as you start/stop heartbeat on a node): tail -f /var/log/ha-log. Also, as a matter of good practice, you usually try to have identical ha.cf files on all servers (include both ucast eth0 lines in your case); the 'local' address is automatically ignored by heartbeat. – cyberx86 Mar 13 '12 at 22:50
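
The suggestion above about keeping ha.cf identical on both servers could look roughly like the sketch below. It reuses the directives and the two ucast addresses from the question unchanged. Whether those public addresses or the instances' private addresses are the right targets is not settled in this thread, so treat the addresses as placeholders to adjust.

```
# ha.cf sketch, intended to be the same file on both nodes.
# Addresses are copied from the question and may need to be replaced.
autojoin none
# Both peers are listed; per the comment above, heartbeat ignores
# the entry that matches the local machine.
ucast eth0 175.41.181.175
ucast eth0 175.41.182.186
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node ip-10-130-83-33
node ip-10-130-71-107
```

With a single shared file, a difference between the two servers' configurations is ruled out as a cause, which leaves the port and security group checks above as the next things to verify.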