
In my project we have an etcd DB deployed on an on-prem Kubernetes cluster (this etcd is for application use, separate from the Kubernetes etcd). I deployed it as a StatefulSet using the Bitnami Helm chart. Initially, at the time of deployment, the number of replicas was 1, since we only wanted a single etcd instance at first.

The real problem started when we scaled it up to 3. I updated the configuration to scale it up by updating ETCD_INITIAL_CLUSTER with the DNS names of the two new members:

etcd-0=http://etcd-0.etcd-headless.wallet.svc.cluster.local:2380,etcd-1=http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380,etcd-2=http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380
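The scale-up itself was a Helm upgrade along these lines (illustrative; the exact chart value name depends on the Bitnami chart version, and etcd is the release name):

# Scale the Bitnami etcd release in the wallet namespace from 1 to 3
# replicas (sketch only; value names may differ per chart version):
helm upgrade etcd bitnami/etcd --namespace wallet --set replicaCount=3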

Now when I go inside any of the etcd pods and run etcdctl member list, I only get a list of members and none of them is marked as leader, which is wrong. One of the three should be the leader.

Also after running for some time these pods start giving heartbeat exceeds error and server overload error:

W | etcdserver: failed to send out heartbeat on time (exceeded the 950ms timeout for 593.648512ms, to a9b7b8c4e027337a)
W | etcdserver: server is likely overloaded
W | wal: sync duration of 2.575790761s, expected less than 1s

I changed the heartbeat default value accordingly; the number of errors decreased, but I still get a few heartbeat-exceeded warnings along with the others.
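For reference, I tuned the timings through the standard etcd environment variables on the pods, along these lines (the values here are illustrative, not a recommendation):

# Standard etcd tuning knobs, in milliseconds; etcd's tuning docs suggest
# the election timeout should be roughly 10x the heartbeat interval:
ETCD_HEARTBEAT_INTERVAL=500
ETCD_ELECTION_TIMEOUT=5000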

I'm not sure what the problem is here. Is it I/O that's causing it? If so, I'm not sure how to confirm that.

I would really appreciate any help on this.


2 Answers


I don't think the heartbeats are the main problem; the logs you are seeing are also just warning (W) logs. So it's possible that some heartbeats are missed here and there, but your node(s) are not crashing or mirroring.

It's likely that you changed the replica count and your new replicas are not joining the cluster. I would recommend following this guide to add the new members to the cluster. Basically, with etcdctl, something like this (--peer-urls points at the member being added):

etcdctl member add node2 --peer-urls=http://node2:2380
etcdctl member add node3 --peer-urls=http://node3:2380

Note that you will have to run these commands from a pod that has access to all the etcd nodes in your cluster.
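Roughly, the full join flow for one new member would look like this (the URLs follow your naming; ETCD_INITIAL_CLUSTER_STATE is the standard etcd setting that makes a member join an existing cluster instead of bootstrapping a new one):

# 1. Register the new member with the existing cluster, from a pod that
#    can reach the current member:
etcdctl --endpoints=http://etcd-0.etcd-headless.wallet.svc.cluster.local:2379 \
  member add etcd-1 --peer-urls=http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380

# 2. Then start the new etcd-1 pod with ETCD_INITIAL_CLUSTER_STATE=existing
#    (plus the full ETCD_INITIAL_CLUSTER list) so it joins the cluster
#    rather than bootstrapping a fresh one.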

You could also consider managing your etcd cluster with the etcd operator which should be able to take care of the scaling and removal/addition of nodes.
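With the operator, scaling is then just a matter of bumping the size field on the custom resource, something like this (a sketch, assuming the CoreOS etcd-operator's EtcdCluster CRD):

# Sketch: declare a 3-member cluster and let the operator reconcile it.
kubectl apply -f - <<EOF
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: etcd
  namespace: wallet
spec:
  size: 3
EOF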

✌️

  • Thank you for replying, really appreciate it. You said it might be the case that the replicas are not joining the cluster, but I can see the new members when I run etcdctl member list. Do I still need to add the members using the command you mentioned above? – Mr Kashyap Jul 22 '20 at 04:44
  • If the members are already there, then don't. In your original question, you mentioned that the members were not joining. – Rico Jul 22 '20 at 04:54
  • Actually the problem is that there is no leader election. I am checking which node is elected as leader by running the same command, etcdctl member list, and no one is elected as leader. There are a few errors on pod startup: 2020-07-21 06:01:59.619929 E | etcdserver: publish error: etcdserver: request timed out 2020-07-21 06:02:06.622020 W | rafthttp: lost the TCP streaming connection with peer a9b7b8c4e027337a (stream Message reader) 2020-07-21 06:02:06.629586 E | rafthttp: failed to dial a9b7b8c4e027337a on stream MsgApp v2 (peer a9b7b8c4e027337a failed to find local node ea673ac594ba66e2) – Mr Kashyap Jul 22 '20 at 05:58

Okay, I had two problems:

  • "failed to send out heartbeat" Warning messages.

  • "No leader election".

The next day I found out the reason for the second problem: I had a startup parameter set in the pod definition, ETCDCTL_API: 3.

So when I run etcdctl member list with API v3, it doesn't show which member is elected as leader.

$ ETCDCTL_API=3 etcdctl member list
    
    3d0bc1a46f81ecd9, started, etcd-2, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2379, false
    b6a5d762d566708b, started, etcd-1, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2379, false


$ ETCDCTL_API=2 etcdctl member list
    
    3d0bc1a46f81ecd9, started, etcd-2, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-2.etcd-headless.wallet.svc.cluster.local:2379, false
    b6a5d762d566708b, started, etcd-1, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2380, http://etcd-1.etcd-headless.wallet.svc.cluster.local:2379, true

So when I use API v2 I can see which node is elected as leader, and there was actually no problem with leader election. I'm still working on the heartbeat warnings, but I guess I need to tune the config to avoid those.
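For what it's worth, etcdctl endpoint status is a more direct way to see the leader under the v3 API; its table output has an IS LEADER column (assuming a reasonably recent etcdctl):

# Show per-endpoint status, including the IS LEADER column:
$ ETCDCTL_API=3 etcdctl \
    --endpoints=http://etcd-1.etcd-headless.wallet.svc.cluster.local:2379,http://etcd-2.etcd-headless.wallet.svc.cluster.local:2379 \
    endpoint status -w table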

NB: I have 3 nodes; I stopped one for testing, which is why only two members appear above.
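As for the heartbeat warnings, one way to check whether slow disk I/O is the cause (a sketch; it assumes fio is available in the pod and the Bitnami data dir /opt/bitnami/etcd/data):

# etcd exposes its WAL fsync latency as a Prometheus histogram on the
# client port; consistently high percentiles point at a slow disk:
curl -s http://localhost:2379/metrics | grep etcd_disk_wal_fsync_duration_seconds

# Benchmark fdatasync latency on the data disk; for etcd, the 99th
# percentile of fdatasync should ideally stay below ~10ms:
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/opt/bitnami/etcd/data --size=22m --bs=2300 \
    --name=etcd-disk-check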

  • If you are using the Bitnami Helm chart, make sure you are using the /opt/bitnami/etcd/data directory; if not, specify the default etcd data directory in the startup env parameters of the pod and in the volume mounts. – Mr Kashyap Aug 18 '20 at 06:23