
As a training platform, I set up a MicroK8s cluster on 3 Raspberry Pi 4s. In principle it's working fine; I can deploy applications, etc. Since it consists of 3 nodes, high availability was also enabled automatically.

master node 192.168.1.225
worker node1 192.168.1.226
worker node2 192.168.1.227

https://microk8s.io/high-availability says that MicroK8s uses all nodes - even the master - to execute workloads. To validate this, I installed the Prometheus/Grafana stack to look at cluster metrics.

However, it looks like the master node is the only node not correctly discovered by Prometheus as a target; see the screenshots from the Prometheus GUI below. Disabling and re-enabling the prometheus addon did not fix the issue.
Consequently, in Grafana I can also only see the worker nodes, not the master. It feels a bit like the master node is not treated like the worker nodes - however, it should be, since it provides capacity to the cluster.

Any idea how to fix this?

The kube-apiserver targets look good:

The kubelet targets do not list the master node, but show a duplicate worker node:

The node-exporter targets only cover the worker nodes:
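For reference, a quick way to check which nodes the node-exporter pods actually land on (a minimal sketch; the label selector app.kubernetes.io/name=node-exporter comes from the standard kube-prometheus manifests and is an assumption here):

# List node-exporter pods together with the node each one is scheduled on
microk8s.kubectl get pods -n monitoring -o wide -l app.kubernetes.io/name=node-exporter

# Cross-check against the nodes the cluster knows about
microk8s.kubectl get nodes -o wide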

Output of microk8s status:

microk8s status
microk8s is running
high-availability: yes
  datastore master nodes: 192.168.1.225:19001 192.168.1.226:19001 192.168.1.227:19001
  datastore standby nodes: none
addons:
  enabled:
    dashboard            # The Kubernetes dashboard
    dns                  # CoreDNS
    ha-cluster           # Configure high availability on the current node
    helm3                # Helm 3 - Kubernetes package manager
    ingress              # Ingress controller for external access
    metallb              # Loadbalancer for your Kubernetes cluster
    metrics-server       # K8s Metrics Server for API access to service metrics
    prometheus           # Prometheus operator for monitoring and logging
  disabled:
    dashboard-ingress    # Ingress definition for Kubernetes dashboard
    helm                 # Helm 2 - the package manager for Kubernetes
    host-access          # Allow Pods connecting to Host services smoothly
    linkerd              # Linkerd is a service mesh for Kubernetes and other frameworks
    openebs              # OpenEBS is the open-source storage solution for Kubernetes
    portainer            # Portainer UI for your Kubernetes cluster
    rbac                 # Role-Based Access Control for authorisation
    registry             # Private image registry exposed on localhost:32000
    storage              # Storage class; allocates storage from host directory
    traefik              # traefik Ingress controller for external access

Pods in the monitoring namespace:

microk8s.kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS      AGE
node-exporter-dkhks                    2/2     Running   2 (11d ago)   11d
prometheus-adapter-5b7fb5c557-2bbqs    1/1     Running   2 (11d ago)   11d
prometheus-operator-667757c7b9-7ll9v   2/2     Running   4 (24h ago)   11d
alertmanager-main-0                    2/2     Running   4 (24h ago)   11d
node-exporter-qc467                    2/2     Running   4 (24h ago)   11d
grafana-59f6895cb8-28dmn               1/1     Running   2 (24h ago)   11d
blackbox-exporter-5c4d9867d6-57wxv     3/3     Running   6 (24h ago)   11d
prometheus-k8s-0                       2/2     Running   3 (24h ago)   11d
prometheus-adapter-5b7fb5c557-dfx6v    1/1     Running   3 (24h ago)   11d
kube-state-metrics-bbd47c478-4qb54     3/3     Running   7 (24h ago)   11d
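Note that only two node-exporter pods are listed above for a three-node cluster. A quick way to confirm whether the DaemonSet simply skips the master (a minimal sketch; the DaemonSet name node-exporter matches the pod names above but may differ in other setups):

# DESIRED should equal the number of nodes; a lower CURRENT/READY count points at a scheduling problem
microk8s.kubectl get daemonset node-exporter -n monitoring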
  • Try deploying Prometheus from scratch. You'll understand what you do, and eventually you'll do it better than the prometheus-operator. The project has been in beta for years, its generated scrape configuration is up for debate, ... ServiceMonitors are pretty much useless until you deal with multi-tenancy -- and even then there are cleaner ways to do it, with proper RBAC, ... It's a mess you don't want to get into, if it's not too late. Still, you can share your prometheus.yml, especially whatever was generated in the scrape_configs block. – SYN Feb 16 '22 at 22:44
  • Also: the node-exporter only has 2 Pods in your monitoring namespace? That doesn't explain what's going on with the kubelet/cadvisor metrics, but you'll be missing other metrics for sure. Check the nodeSelector of the node-exporter DaemonSet; those labels should be present on the master node. Also check the taints on the master node. If you find some, try to add tolerations for those taints to your node-exporter DaemonSet. – SYN Feb 16 '22 at 22:51
  • I have checked the DaemonSet and noticed some strange behaviour on the master node: even though the taints were OK, no pods were executed there. I therefore decided to remove the master from the cluster; thankfully HA worked as expected. After re-installing MicroK8s and joining the cluster again, all 3 nodes are working fine and 3 node-exporters are now running. I still see a duplication, but at least Grafana now shows the full cluster utilization, e.g. memory usage is reasonable. Thanks for the hint. – Survis Feb 18 '22 at 21:06
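Following up on the checks suggested in the comments above, a minimal sketch of how the taints and tolerations can be inspected (the node name placeholder and the example taint key are assumptions; use whatever the describe output actually reports):

# Show taints on the master node (replace the placeholder with your actual node name)
microk8s.kubectl describe node <master-node-name> | grep -i Taints

# Inspect the nodeSelector and tolerations of the node-exporter DaemonSet
microk8s.kubectl get daemonset node-exporter -n monitoring -o yaml | grep -A 10 -E 'nodeSelector|tolerations'

If a taint shows up, a toleration along these lines could be added under spec.template.spec.tolerations of the DaemonSet (key and effect below are only an example):

tolerations:
- key: "node-role.kubernetes.io/master"
  operator: "Exists"
  effect: "NoSchedule"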

1 Answer


It looks like, after joining the 3rd node, high availability was automatically activated, but the original master node was still treated somewhat differently (e.g. it was not showing up in "get nodes").

After leaving and re-joining the cluster, all nodes are now working as expected and the node exporter is running on all nodes via the DaemonSet. Worker node1 is still somehow duplicated in the Prometheus targets, so I will check whether a re-join fixes that issue as well.
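For reference, the standard MicroK8s commands for this kind of node removal and re-join look roughly like this (a minimal sketch; node names and the join token are placeholders, and the re-joining node needs a fresh MicroK8s install first):

# On the node being removed (the old master)
microk8s leave

# On one of the remaining nodes, drop the departed node from the cluster
microk8s remove-node <old-master-node-name>

# On an existing cluster node, generate a join token
microk8s add-node

# On the re-installed master, run the join command printed by add-node, e.g.
microk8s join 192.168.1.226:25000/<token>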

microk8s.kubectl get pods -n monitoring


NAME                                   READY   STATUS    RESTARTS      AGE
prometheus-adapter-5b7fb5c557-cqxsw    1/1     Running   0             22h
node-exporter-mxdjw                    2/2     Running   0             22h
prometheus-operator-667757c7b9-5724x   2/2     Running   0             22h
prometheus-k8s-0                       2/2     Running   1 (22h ago)   22h
blackbox-exporter-5c4d9867d6-ct9q7     3/3     Running   0             22h
alertmanager-main-0                    2/2     Running   0             22h
prometheus-adapter-5b7fb5c557-xjn2s    1/1     Running   0             22h
node-exporter-dpkgg                    2/2     Running   0             22h
kube-state-metrics-bbd47c478-82vkk     3/3     Running   0             22h
grafana-59f6895cb8-qkx7p               1/1     Running   0             22h
node-exporter-5tgfw                    2/2     Running   0             4h45m

