I've deployed several different services and always get the same error.
The service is reachable on the node port from the machine where the pod is running. On the two other nodes I get timeouts.
The kube-proxy is running on all worker nodes and I can see in the logfiles from kube-proxy that the service port was added and the node port was opened. In this case I've deployed the stars demo from calico
Kube-proxy log output:
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458 659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483 659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)
The kube-proxy is listening on the port 30002
root@kuben1:/tmp# netstat -lanp | grep 30002
tcp6 0 0 :::30002 :::* LISTEN 659/kube-proxy
There are also some iptable rules defined:
root@kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ tcp -- !10.200.0.0/16 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
The interesting part is that I can reach the service IP from any worker node
root@kubem1:/tmp# kubectl get svc -n management-ui
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
management-ui NodePort 10.32.0.133 <none> 9001:30002/TCP 52m
The service IP/port can be accessed from any worker node if I do a "curl http://10.32.0.133:9001"
I don't understand why kube-proxy does not "route" this properly...
Has anyone a hint where I can find the error?
Here some cluster specs:
This is a hand build cluster inspired by Kelsey Hightower's "kubernetes the hard way" guide.
- 6 Nodes (3 master: 3 worker) local vms
- OS: Ubuntu 18.04
- K8s: v1.13.0
- Docker: 18.9.3
- Cni: calico
Component status on the master nodes looks okay
root@kubem1:/tmp# kubectl get componentstatus
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
The worker nodes are looking okay if I trust kubectl
root@kubem1:/tmp# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kuben1 Ready <none> 39d v1.13.0 192.168.178.77 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben2 Ready <none> 39d v1.13.0 192.168.178.78 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben3 Ready <none> 39d v1.13.0 192.168.178.79 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
As asked by P Ekambaram:
root@kubem1:/tmp# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-bgjdg 1/1 Running 5 40d
calico-node-nwkqw 1/1 Running 5 40d
calico-node-vrwn4 1/1 Running 5 40d
coredns-69cbb76ff8-fpssw 1/1 Running 5 40d
coredns-69cbb76ff8-tm6r8 1/1 Running 5 40d
kubernetes-dashboard-57df4db6b-2xrmb 1/1 Running 5 40d