Weird node number in load balancer setup

Question

I've run into an odd problem. I set up Corosync and Pacemaker using the guide below as a reference, but I improvised a little the first time since I'm doing this to learn, not just to follow instructions blindly. When I got this odd error, I booted up a new VPS to try again, this time following the instructions to the letter.

Here is the guide I followed, pretty nice set up from Mitchell Anicas over at Digital Ocean: How To Create a High Availability HAProxy Setup with Corosync, Pacemaker, and Floating IPs on Ubuntu 14.04 | digitalocean.com

The errors I have been getting are related to the number of nodes in the cluster. In my settings, I have explicitly specified a two-node cluster.

OS: Ubuntu Xenial Xerus (16.04.4)

totem {
  version: 2
  cluster_name: lbcluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: primary's-privateIP
    broadcast: yes
    mcastport: 5405
  }
}

quorum {
  provider: corosync_votequorum
  two_node: 1
}

nodelist {
  node {
    ring0_addr: primary's-privateIP
    name: primary
    nodeid: 1
  }
  node {
    ring0_addr: secondary's-privateIP
    name: secondary
    nodeid: 2
  }
}

logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}

If I run sudo crm status, the output I get looks like this.

Last updated: Fri Apr 13 15:31:47 2018          Last change: Fri Apr 13 14:08:42 2018 by root via cibadmin on secondary
Stack: corosync
Current DC: secondary (version 1.1.14-70404b0) - partition with quorum
3 nodes and 0 resources configured

Online: [ primary secondary ]
OFFLINE: [ sh-ps-02 ]

I also ran sudo crm configure show to show the configuration:

node 1: primary
node 2: secondary
node 2130706433: sh-ps-02
property cib-bootstrap-options: \
  have-watchdog=false \
  dc-version=1.1.14-70404b0 \
  cluster-infrastructure=corosync \
  cluster-name=debian \
  stonith-enabled=false \
  no-quorum-policy=ignore

Why is there a weird-looking node, carrying the hostname of the running secondary node but shown as offline, even though the cluster is explicitly configured as a two-node cluster?

Addition 16. April 2018: I ran sudo corosync-cmapctl | grep members to get the members of the cluster, and there are no traces of that weird offline cluster member.

runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(x.x.82.204)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 3
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(x.x.82.167)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined

It's not Corosync related. You're mistaking Corosync for Pacemaker. – kubanczyk, Apr 14 '18 at 09:24
It is Corosync that I'm configuring. Most of the config lives in Corosync, not in Pacemaker. The config that dictates whether it's two-node or not is located at /etc/corosync/corosync.conf. Furthermore, I don't see where in my message I say whether I'm thinking of Corosync or Pacemaker. I'm asking generally, not pointing at a specific piece of software. – StianM, Apr 16 '18 at 09:44

score 0 · Accepted Answer · answered Apr 16 '18 at 16:38

I believe Xenial ships Corosync and Pacemaker started and enabled in systemd, with a default corosync.conf that brings up a "single node cluster". That entry is likely the hostname of one of your nodes, added before you set the names primary and secondary.
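
One hint that points in this direction: when no nodeid is set, Corosync derives the node ID from the 32-bit IPv4 address that ring 0 is bound to. The sketch below decodes the stray ID from the question into dotted-quad form. The function name nodeid_to_ip is made up for illustration, and the idea that the packaged default config binds to the loopback address is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Decode a 32-bit Corosync node ID into dotted-quad IPv4 notation.
# nodeid_to_ip is a hypothetical helper name, not part of any cluster tool.
nodeid_to_ip() {
    id=$1
    printf '%d.%d.%d.%d\n' \
        $(( (id >> 24) & 255 )) \
        $(( (id >> 16) & 255 )) \
        $(( (id >> 8) & 255 )) \
        $(( id & 255 ))
}

# The stray node ID from `crm configure show` in the question.
nodeid_to_ip 2130706433   # prints 127.0.0.1
```

An ID that decodes to 127.0.0.1 would be consistent with the node having registered itself once while Corosync was still running with a loopback-bound default configuration, before the real corosync.conf was put in place.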

To clean it up, simply delete that entry:

# crm node delete sh-ps-02
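
If more than one stale entry might be around, it can help to list them before deleting anything. The following is a minimal sketch, assuming the `node <id>: <name>` line format shown in the question's crm configure show output. The helper name list_stray_nodes is hypothetical, and the expected IDs (1 and 2) are taken from the nodelist in the question.

```shell
#!/bin/sh
# list_stray_nodes (hypothetical helper): read `crm configure show` output on
# stdin and print the names of node entries whose ID is NOT among the expected
# node IDs passed as arguments.
list_stray_nodes() {
    expected=" $* "
    while read -r kind id name rest; do
        [ "$kind" = "node" ] || continue   # only look at "node <id>: <name>" lines
        [ -n "$name" ] || continue         # skip malformed entries
        id=${id%:}                         # strip the trailing colon from the ID
        case "$expected" in
            *" $id "*) ;;                  # ID is in the nodelist, nothing to do
            *) printf '%s\n' "$name" ;;    # unexpected ID: report the node name
        esac
    done
}

# Sample input copied from the question; on a live cluster one would instead
# pipe in the real output: sudo crm configure show | list_stray_nodes 1 2
list_stray_nodes 1 2 <<'EOF'
node 1: primary
node 2: secondary
node 2130706433: sh-ps-02
property cib-bootstrap-options: \
EOF
# prints: sh-ps-02
```

Each name it prints is a candidate for crm node delete, after checking that it really is not a member you want to keep.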

Side note: naming your nodes primary and secondary isn't a great practice. node-a and node-b would be better, since either node in the cluster should be able to act as "primary" or "secondary".

Matt Kereczman

Thank you for these suggestions. primary and secondary were the default names, and I will change them to something more identifiable when I'm done and enter the testing stage! I will test your suggestions now and report back what I find! :) Edit: Thank you very much, this worked out perfectly and I've added this to my notes! +++ For good help! :3 – StianM Apr 17 '18 at 07:49