I maintain a Kubernetes cluster. The nodes are on an intranet with 10.0.0.0/8 addresses, and the pod network range is 192.168.0.0/16.
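For context, the cluster was bootstrapped with kubeadm; the pod CIDR is the kind of setting passed at init time (the command below is illustrative; 192.168.0.0/16 is also Calico's default pool):

kubeadm init --pod-network-cidr=192.168.0.0/16   # pod CIDR must match Calico's IP pool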
The problem is that some of the worker nodes have unreachable routes to the pod networks on other nodes, for example:
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.a.b.65 0.0.0.0 UG 0 0 0 eth0
10.a.b.64 0.0.0.0 255.255.255.192 U 0 0 0 eth0
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.20.0 - 255.255.255.192 ! 0 - 0 -
192.168.21.128 - 255.255.255.192 ! 0 - 0 -
192.168.22.64 0.0.0.0 255.255.255.192 U 0 0 0 *
192.168.22.66 0.0.0.0 255.255.255.255 UH 0 0 0 cali3859982c59e
192.168.24.128 - 255.255.255.192 ! 0 - 0 -
192.168.39.192 - 255.255.255.192 ! 0 - 0 -
192.168.49.192 - 255.255.255.192 ! 0 - 0 -
...
192.168.208.128 - 255.255.255.192 ! 0 - 0 -
192.168.228.128 10.14.170.104 255.255.255.192 UG 0 0 0 tunl0
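For reference, the table above is the output of route -n on one of the broken nodes; the ! flag marks a reject (unreachable) route, and - means no gateway or interface is installed for it:

route -n   # print the kernel IP routing table ('U' = up, 'G' = gateway, 'H' = host, '!' = reject)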
When I docker exec into the calico-node container, BIRD reports the routes to other nodes' pod subnets as unreachable:
192.168.108.64/26 unreachable [Mesh_10_15_39_59 08:04:59 from 10.a.a.a] * (100/-) [i]
192.168.112.128/26 unreachable [Mesh_10_204_89_220 08:04:58 from 10.b.b.b] * (100/-) [i]
192.168.95.192/26 unreachable [Mesh_10_204_30_35 08:04:59 from 10.c.c.c] * (100/-) [i]
192.168.39.192/26 unreachable [Mesh_10_204_89_152 08:04:59 from 10.d.d.d] * (100/-) [i]
...
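The bird output above was captured inside the calico-node container, roughly as follows (the control-socket path is the usual Calico one, but it may differ between versions):

docker exec -it <calico-node-container> sh
birdcl -s /var/run/calico/bird.ctl show route   # routes BIRD learned over the node-to-node BGP mesh

From the host, calicoctl node status reports the state of each BGP mesh session, which shows the same problem from the other side.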
As a result, pods on the broken nodes can barely reach anything else in the cluster.
I've tried restarting a broken node, removing it from the cluster, running kubeadm reset, and re-joining it, but nothing changed.
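Concretely, the remove/re-join sequence was roughly the following (node name and join parameters elided):

# on the control-plane node
kubectl drain <node-name> --ignore-daemonsets
kubectl delete node <node-name>
# on the broken worker
kubeadm reset
kubeadm join <apiserver>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>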
What could be causing this, and how should I fix it? Many thanks in advance.