
I have deployed a MySQL database as a StatefulSet on a zonal Kubernetes cluster (GKE) in Google Cloud Platform.

The zonal cluster consists of 3 instances of type e2-medium.

The MySQL container cannot start due to the following error:

kubectl logs mysql-statefulset-0
2022-02-07 05:55:38+00:00 [Note] [Entrypoint]: Entrypoint script for MySQL Server 5.7.35-1debian10 started.
find: '/var/lib/mysql/': Input/output error

Last seen events:

4m57s   Warning   Ext4Error   gke-cluster-default-pool-rnfh   kernel-monitor, gke-cluster-default-pool-rnfh   EXT4-fs error (device sdb): __ext4_find_entry:1532: inode #2: comm mysqld: reading directory lblock 0   40d   8062   gke-cluster-default-pool-rnfh
3m22s   Warning   BackOff     pod/mysql-statefulset-0   spec.containers{mysql}   kubelet, gke-cluster-default-pool-rnfh   Back-off restarting failed container
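The same EXT4 errors should also be visible in the node's kernel log. A sketch of how to check, assuming SSH access to the GKE node (the zone is a placeholder):

# SSH into the affected node (replace <zone> with the node's zone)
gcloud compute ssh gke-cluster-default-pool-rnfh --zone <zone>

# On the node, grep the kernel log for filesystem errors
dmesg | grep -i 'EXT4-fs'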

Nodes:

kubectl get node -owide
gke-cluster-default-pool-ayqo   Ready    <none>   54d   v1.21.5-gke.1302   So.Me.I.P   So.Me.I.P    Container-Optimized OS from Google   5.4.144+         containerd://1.4.8
gke-cluster-default-pool-rnfh   Ready    <none>   54d   v1.21.5-gke.1302   So.Me.I.P   So.Me.I.P   Container-Optimized OS from Google   5.4.144+         containerd://1.4.8
gke-cluster-default-pool-sc3p   Ready    <none>   54d   v1.21.5-gke.1302   So.Me.I.P   So.Me.I.P     Container-Optimized OS from Google   5.4.144+         containerd://1.4.8

I also noticed that the rnfh node is out of memory.

kubectl top node
NAME                            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-cluster-default-pool-ayqo   117m         12%    992Mi           35%
gke-cluster-default-pool-rnfh   180m         19%    2953Mi          104%
gke-cluster-default-pool-sc3p   179m         19%    1488Mi          52%

MySQL manifest:

# HEADLESS SERVICE
apiVersion: v1
kind: Service
metadata:
  name: mysql-headless-service
  labels:
    kind: mysql-headless-service
spec:
  clusterIP: None
  selector:
    tier: mysql-db
  ports:
    - name: 'mysql-http'
      protocol: 'TCP'
      port: 3306
---
# STATEFUL SET
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-statefulset
spec:
  selector:
    matchLabels:
      tier: mysql-db
  serviceName: mysql-statefulset
  replicas: 1
  template:
    metadata:
      labels:
        tier: mysql-db
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: my-mysql
          image: my-mysql:latest
          imagePullPolicy: Always
          args:
            - "--ignore-db-dir=lost+found"
          ports:
            - name: 'http'
              protocol: 'TCP'
              containerPort: 3306
          volumeMounts:
            - name: mysql-pvc
              mountPath: /var/lib/mysql
          env:
            - name: MYSQL_ROOT_USER
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: mysql-root-username
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: mysql-root-password
            - name: MYSQL_USER
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-username
            - name: MYSQL_PASSWORD
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-password
            - name: MYSQL_DATABASE
              valueFrom:
                configMapKeyRef:
                  name: mysql-config
                  key: mysql-database
  volumeClaimTemplates:
    - metadata:
        name: mysql-pvc
      spec:
        storageClassName: 'mysql-fast'
        resources:
          requests:
            storage: 120Gi
        accessModes:
          - ReadWriteOnce
          - ReadOnlyMany

MySQL storage class manifest:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mysql-fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate

Why is Kubernetes trying to schedule the pod on an out-of-memory node?

UPDATES

I've added requests and limits to the MySQL manifest to improve the QoS class. Now the QoS class is Guaranteed.
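For reference, the Guaranteed QoS class requires requests equal to limits for every container. A minimal sketch of the block added to the container spec (the values here are illustrative, not necessarily the exact ones used):

          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 500m
              memory: 1Gi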

Unfortunately, Kubernetes is still trying to schedule the pod onto the out-of-memory rnfh node.

kubectl describe po mysql-statefulset-0 | grep node -i
Node: gke-cluster-default-pool-rnfh/So.Me.I.P

kubectl describe po mysql-statefulset-0 | grep qos -i
QoS Class: Guaranteed
  • `I also noticed that rnfh node is out of memory.` - this is **AFTER** your mysql pod was deployed to the node? – gohm'c Feb 07 '22 at 11:40
  • I've tested your yamls/config. The only issue I've encountered was that I couldn't deploy with your image `my-mysql:latest`. Is that your own image? When I used `image: mysql:5.7`, everything worked as expected and the last entry in the log was `[Note] mysqld: ready for connections.` If you use `image: mysql:5.7`, do you also get this issue? If this is your custom image, there is a possibility that something fell into a loop and is "eating" your resources. With one pod with the image mysql:5.7 on e2-medium I have only `754Mi 26%` – PjoterS Feb 07 '22 at 12:15
  • @PjoterS `my-mysql:latest` is based on `mysql:5.7` with an additional COPY of `mysql.cnf` to `/etc/mysql/conf.d/custom.cnf`. In the cnf file I defined `default-character-set = utf8mb4` – Mikolaj Feb 07 '22 at 13:05
  • And if you try to use just the `mysql:5.7` image? Do you have more statefulsets or deployments in this cluster? Also, what output do you get when you execute `$ kubectl top pods -A | sort`? Maybe there is something else which consumes your memory. I don't see any node affinity or anything similar. Maybe your SQL was deployed and later another application started to consume memory. Is it a whole new cluster? – PjoterS Feb 07 '22 at 13:17
  • @gohm'c The MySQL container was running without any issue for 1 month. Two days ago, for reasons unknown to me, Kubernetes restarted the container and kept trying to run it on the `rnfh` machine. The container was probably evicted from another node. – Mikolaj Feb 07 '22 at 13:17
  • I deleted all pods from all namespaces a couple of times so that Kubernetes would schedule the load again across the 3 nodes. The MySQL StatefulSet pod was always thrown onto the `rnfh` node and always returned the error, even when the node had free space. – Mikolaj Feb 07 '22 at 13:27
  • Finally I cordoned the `rnfh` node; then Kubernetes scheduled the MySQL StatefulSet pod to a different node and everything works fine without any error. – Mikolaj Feb 07 '22 at 13:30

1 Answer


I ran a few more tests but I couldn't replicate this.

To answer this one correctly, we would need many more logs; I'm not sure if you still have them. If I had to guess the root cause of this issue, I would say it was connected with the PersistentVolume.

In one of the GitHub issues - Volume was remounted as read only after error #752 - I found behavior very similar to the OP's.

You have created a dedicated StorageClass for your MySQL. You've set `reclaimPolicy: Retain`, so the PV was not removed. When the StatefulSet pod (with the same suffix `-0`) was recreated (restarted due to a connectivity error, some issue on the DB side, hard to say), it tried to claim this volume again. In the mentioned GitHub issue the user had a very similar situation: they also got an `inode #262147: comm mysqld: reading directory lblock` issue, but below it there was also the entry `[ +0.003695] EXT4-fs (sda): Remounting filesystem read-only`. Maybe the permissions changed when it was re-mounted?
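If the cluster is still in this state, the volume side can be inspected directly; a sketch (the PVC name follows the StatefulSet convention <claim>-<statefulset>-<ordinal>):

# Find the PV bound to the StatefulSet's claim and check its status
kubectl get pv

# Check reclaim policy, access modes and events on that PV
kubectl describe pv <pv-name>

# Inspect the claim created from the volumeClaimTemplates
kubectl describe pvc mysql-pvc-mysql-statefulset-0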

Another thing is that your volumeClaimTemplates contained:

        accessModes:
          - ReadWriteOnce
          - ReadOnlyMany

So one PersistentVolume could be used as ReadWriteOnce by a single node, or as ReadOnlyMany by many nodes. There is a possibility that the pod was recreated on a different node with a read-only accessMode.
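Since a single-replica MySQL only ever needs one writer, a sketch of the volumeClaimTemplates restricted to a single access mode would be:

  volumeClaimTemplates:
    - metadata:
        name: mysql-pvc
      spec:
        storageClassName: 'mysql-fast'
        accessModes:
          - ReadWriteOnce   # one node, read-write; no ReadOnlyMany
        resources:
          requests:
            storage: 120Gi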

[ +35.912075] EXT4-fs warning (device sda): htree_dirblock_to_tree:977: inode #2: lblock 0: comm mysqld: error -5 reading directory block
[  +6.294232] EXT4-fs error (device sda): ext4_find_entry:1436: inode #262147: comm mysqld: reading directory lblock ...
[  +0.005226] EXT4-fs error (device sda): ext4_find_entry:1436: inode #2: comm mysqld: reading directory lblock 0
[  +1.666039] EXT4-fs error (device sda): ext4_journal_check_start:61: Detected aborted journal
[ +0.003695] EXT4-fs (sda): Remounting filesystem read-only

It would fit the OP's comment:

Two days ago, for reasons unknown to me, Kubernetes restarted the container and kept trying to run it on the rnfh machine. The container was probably evicted from another node.

Another possibility is that the node or the cluster was updated (depending on whether the auto-upgrade option was turned on), which might have forced a restart of the pod.
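Whether auto-upgrade is enabled can be checked on the node pool; a sketch, assuming the cluster and pool names inferred from the node names (the zone is a placeholder):

gcloud container node-pools describe default-pool \
    --cluster cluster --zone <zone> \
    --format='value(management.autoUpgrade)'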

The '/var/lib/mysql/': Input/output error might point to database corruption, as mentioned here.

In general, the issue was resolved by cordoning the affected node. Additional information about the difference between cordon and drain can be found here.
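For reference, cordoning marks the node unschedulable without touching running pods, while draining additionally evicts them:

# Prevent new pods from being scheduled on the affected node
kubectl cordon gke-cluster-default-pool-rnfh

# Optionally also evict pods already running there
kubectl drain gke-cluster-default-pool-rnfh --ignore-daemonsets --delete-emptydir-data

# Re-enable scheduling once the node is healthy again
kubectl uncordon gke-cluster-default-pool-rnfh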

Just as an addition: to assign pods to a specific node, or to nodes with a specified label, you can use affinity, as sketched below.
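A minimal sketch of node affinity, placed in the pod spec of the StatefulSet, that would keep this pod away from the broken node (the label key is the standard kubernetes.io/hostname; the value is the node from the question):

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: kubernetes.io/hostname
                    operator: NotIn
                    values:
                      - gke-cluster-default-pool-rnfh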

PjoterS