I'm using AKS and trying to create a StatefulSet with a PVC from YAML. The PVC appears to be created successfully and is bound, but the pod is in the CrashLoopBackOff state. When I run describe on the pod, I get these events:

  Type     Reason                  Age                    From                                        Message
  ----     ------                  ----                   ----                                        -------
  Warning  FailedScheduling        38m (x2 over 38m)      default-scheduler                           pod has unbound immediate PersistentVolumeClaims (repeated 2 times)
  Normal   Scheduled               38m                    default-scheduler                           Successfully assigned default/janusgraph-test3-0 to aks-agentpool-26199593-vmss000000
  Normal   SuccessfulAttachVolume  37m                    attachdetach-controller                     AttachVolume.Attach succeeded for volume "pvc-00b88841-a21d-430c-9f2f-b65307b156c2"
  Normal   Pulled                  34m (x4 over 37m)      kubelet, aks-agentpool-26199593-vmss000000  Successfully pulled image "janusgraph/janusgraph:latest"
  Normal   Created                 34m (x4 over 37m)      kubelet, aks-agentpool-26199593-vmss000000  Created container janusgraph-test3
  Normal   Started                 34m (x4 over 37m)      kubelet, aks-agentpool-26199593-vmss000000  Started container janusgraph-test3
  Normal   Pulling                 32m (x5 over 37m)      kubelet, aks-agentpool-26199593-vmss000000  Pulling image "janusgraph/janusgraph:latest"
  Warning  BackOff                 2m42s (x124 over 36m)  kubelet, aks-agentpool-26199593-vmss000000  Back-off restarting failed container

The PVC is:

  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: "default"
        resources:
          requests:
            storage: 7Gi

When I run describe on the PVC, I get this event (which suggests everything is fine):

  Type       Reason                 Age   From                         Message
  ----       ------                 ----  ----                         -------
  Normal     ProvisioningSucceeded  19m   persistentvolume-controller  Successfully provisioned volume pvc-00b88841-a21d-430c-9f2f-b65307b156c2 using kubernetes.io/azure-disk

Here is the full describe output for the PVC:

Name:          data-janusgraph-test3-0
Namespace:     default
StorageClass:  default
Status:        Bound
Volume:        pvc-00b88841-a21d-430c-9f2f-b65307b156c2
Labels:        app=janusgraph-test3
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/azure-disk
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      7Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Events:
  Type       Reason                 Age   From                         Message
  ----       ------                 ----  ----                         -------
  Normal     ProvisioningSucceeded  19m   persistentvolume-controller  Successfully provisioned volume pvc-00b88841-a21d-430c-9f2f-b65307b156c2 using kubernetes.io/azure-disk
Mounted By:  janusgraph-test3-0

Based on the above, I have no clue what went wrong. Similar issues I found on the web are usually related to mismatched read/write access modes, but that does not seem to be the case here, since there is no error about it. In addition, I have already created two other StatefulSets in my AKS cluster that use the same type of configuration, just with different StatefulSet names.

Update: running kubectl logs on the pod shows this:

waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
waiting for storage...
Error occurred during initialization of VM
agent library failed to init: instrument
Error opening zip file or JAR manifest missing : /var/lib/janusgraph/jmx_prometheus_javaagent-0.13.0.jar

As you can see, the container's storage does not seem to be attached for some reason (I guess the JAR error is just a side effect). Any idea?

toto
  • Looking at the timestamps in the `kubectl describe statefulset` output, the storage is fine (`SuccessfulVolumeAttach` happened 37 minutes ago) but the container is failing for some reason (4 attempts at `Started`, most recently 34 minutes ago). `kubectl logs janusgraph-test3-0` might tell you something. – David Maze Jan 19 '21 at 12:20
  • Could you post `kube-controller-manager` pod logs at the very moment you are trying to create the statefulset resources? – Daniel Marques Jan 19 '21 at 12:47
  • @DavidMaze That is actually the solution... I updated my question with the logs. I was also able to access the container and see the mount point with the actual files, so I guess something is wrong in my container :/ Thanks! – toto Jan 19 '21 at 13:08

1 Answer

A pod in the CrashLoopBackOff state means its container starts and then keeps exiting, so the problem is usually with whatever is being run inside the container rather than with scheduling or volume attachment.

Check the output of the kubectl logs command to see why the pod is crashing.
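
As a rough sketch of that check, assuming the pod name from the question and the default namespace, a small shell helper could gather the logs of the crashed container instance and its last exit code. `crashloop_info` is a made-up name for illustration, not a kubectl command; the kubectl flags used (`--previous`, `--tail`, `-o jsonpath`) are standard ones.

```shell
# Hypothetical helper (not part of kubectl): collect basic crash-loop
# diagnostics for one pod. Usage: crashloop_info <pod> [namespace]
crashloop_info() {
  pod="$1"
  ns="${2:-default}"   # assumption: fall back to the "default" namespace

  # Logs of the previous (already crashed) container instance,
  # which is usually where the startup error is printed.
  kubectl logs "$pod" -n "$ns" --previous --tail=50

  # Exit code and reason of the last termination of the first container.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}{" "}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'
}
```

Calling `crashloop_info janusgraph-test3-0` would then print the tail of the crashed container's log followed by its exit code. In the question's case the log already shows the JVM failing to open /var/lib/janusgraph/jmx_prometheus_javaagent-0.13.0.jar, which points at the container's startup rather than at the volume attachment.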

Krishna Chaurasia