
After a DigitalOcean (DO) update to 1.24.16-do.0, my cloud-config pod started failing with: Liveness probe failed: Get-<http.ip> context deadline exceeded (Client.Timeout exceeded while awaiting headers). When I curl the IP, I get a 137 error, presumably because the pod is backing off. Traffic is very low, and memory, CPU and thread usage are all well below the configured limits. The issue reproduces on different compute nodes of the cluster. No other resources were changed during the update. The liveness probe as reported for my deployment:

http-get http://:8888/actuator/health delay=180s 

The image is pulled from an internal registry, and that works as well. I also tried disabling all the components that are checked as part of the actuator health check, but nothing changed. Liveness config:

livenessProbe:
      failureThreshold: 4
      httpGet:
        path: /actuator/health
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 180
      periodSeconds: 15
      successThreshold: 1
      timeoutSeconds: 5

I'd be grateful for any hint.

fipse
  • Care to share your YAML configuration for the pod? Especially the liveness/readiness section – DanF Aug 31 '23 at 18:28
  • @DanF I edited my initial question – fipse Aug 31 '23 at 19:16
  • Hmm, can you try giving it more CPU resources, and increase timeoutSeconds to something like 15 seconds? – DanF Aug 31 '23 at 21:04
  • Yeah, I will try your approach and hope it solves this – fipse Sep 01 '23 at 07:32
  • Well @DanF, it seems the new DO update messes up all the resource limits, so there is no other way than deleting them, or temporarily not using them, until DigitalOcean fixes this. – fipse Sep 01 '23 at 08:58
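
For reference, the change suggested in the comments (more CPU and a longer probe timeout) could look roughly like the fragment below. This is only a sketch of the container section of a pod spec: the CPU and memory values are hypothetical placeholders that do not come from the question, and only timeoutSeconds differs from the liveness config posted above.

```yaml
# Sketch only: fragment of a container spec inside a Deployment.
# The resource values are hypothetical placeholders, not taken from the question.
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 512Mi
livenessProbe:
  failureThreshold: 4
  httpGet:
    path: /actuator/health
    port: 8080
    scheme: HTTP
  initialDelaySeconds: 180
  periodSeconds: 15
  successThreshold: 1
  timeoutSeconds: 15   # raised from 5, as suggested in the comments
```

According to the last comment, what actually helped on this cluster was removing the resource limits altogether for the time being, which would correspond to dropping the limits block from the fragment above.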
