
My team's deployment process uses Spring Boot's bootBuildImage task to build the application image, which we then deploy to our Kubernetes cluster. The deployment was working correctly until last week, when it suddenly stopped working.
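
Roughly, the build step is the Spring Boot Gradle task (the exact invocation on our side may differ; the image name here is the one that shows up in the Kubernetes events below):

./gradlew bootBuildImage --imageName=registry.gitlab.com/integration-image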

When we try to deploy a new version, the image is built correctly, but the pod never becomes Ready because the actuator endpoints return a 404:

curl localhost:8080/actuator/health/liveness
<!doctype html>
<html lang="en">
    <head>
        <title>HTTP Status 404 – Not Found</title>
        <style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style>
    </head>
    <body>
        <h1>HTTP Status 404 – Not Found</h1>
    </body>
</html>

I already added the health probe properties (even though we hadn't needed them before):

management.endpoint.health.probes.enabled=true
management.health.livenessState.enabled=true
management.health.readinessState.enabled=true
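
As far as I know, Spring Boot enables these probe endpoints automatically when it detects Kubernetes (through the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables), which would explain why we never needed these properties before. A quick sanity check that those variables are present in the pod (the pod name is a placeholder):

kubectl exec <pod-name> -- env | grep KUBERNETES_SERVICE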

This is what Kubernetes reports for the pod:

Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 07 Feb 2023 01:23:27 -0600
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:8080/actuator/health/liveness delay=60s timeout=1s period=30s #success=1 #failure=3
    Readiness:      http-get http://:8080/actuator/health/readiness delay=60s timeout=1s period=10s #success=1 #failure=3
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  2m30s              default-scheduler  Successfully assigned podname to ip-10-8-x-x.ap-northeast-1.compute.internal
  Normal   Pulled     2m29s              kubelet            Container image "registry.gitlab.com/integration-image" already present on machine
  Normal   Created    2m29s              kubelet            Created container ui-backend
  Normal   Started    2m29s              kubelet            Started container ui-backend
  Warning  Unhealthy  30s (x2 over 60s)  kubelet            Liveness probe failed: HTTP probe failed with statuscode: 404
  Warning  Unhealthy  10s (x8 over 80s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 404

I've also tried:

  • extending the probe delays and timeouts
  • changing the actuator port
  • disabling Spring tasks

In all of these cases Spring starts, but the pod keeps failing the probes and eventually gets restarted by Kubernetes. This is the container startup log:

Setting Active Processor Count to 2
Adding $JAVA_OPTS to $JAVA_TOOL_OPTIONS
Calculating JVM memory based on 5201M available memory
For more information on this calculation, see https://paketo.io/docs/reference/java-reference/#memory-calculator
Calculated JVM Memory Configuration: -XX:MaxDirectMemorySize=10M -Xmx4528339K -XX:MaxMetaspaceSize=285484K -XX:ReservedCodeCacheSize=240M -Xss1M (Total Memory: 5201M, Thread Count: 250, Loaded Class Count: 47989, Headroom: 0%)
Enabling Java Native Memory Tracking
Adding 124 container CA certificates to JVM truststore
NOTE: Picked up JDK_JAVA_OPTIONS:  --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.rmi/sun.rmi.transport=ALL-UNNAMED
Picked up JAVA_TOOL_OPTIONS: -Djava.security.properties=/layers/paketo-buildpacks_bellsoft-liberica/java-security-properties/java-security.properties -XX:+ExitOnOutOfMemoryError -XX:ActiveProcessorCount=2 -Dspring.profiles.active=dev -Dspring.config.additional-location=optional:file:/config/,optional:file:/secret/ -Dsentry.environment=dev -Dhoneycomb.config.file=/secret/application.properties -javaagent:/workspace/WEB-INF/lib/honeycomb-opentelemetry-javaagent-1.0.0.jar -XX:MaxDirectMemorySize=10M -Xmx4528339K -XX:MaxMetaspaceSize=285484K -XX:ReservedCodeCacheSize=240M -Xss1M -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics
[otel.javaagent 2023-02-02 03:27:59:129 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: honeycomb-1.0.0-otel-1.12.0
[CONTAINER] org.apache.coyote.http11.Http11NioProtocol         INFO    Initializing ProtocolHandler ["http-nio-8080"]
[CONTAINER] org.apache.catalina.startup.Catalina               INFO    Server initialization in [968] milliseconds
[CONTAINER] org.apache.catalina.core.StandardService           INFO    Starting service [Catalina]
[CONTAINER] org.apache.catalina.core.StandardEngine            INFO    Starting Servlet engine: [Apache Tomcat/9.0.71]
[CONTAINER] org.apache.catalina.startup.HostConfig             INFO    Deploying web application directory [/layers/paketo-buildpacks_apache-tomcat/catalina-base/webapps/ROOT]
[CONTAINER] org.apache.jasper.servlet.TldScanner               INFO    At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
[CONTAINER] lina.core.ContainerBase.[Catalina].[localhost].[/] INFO    2 Spring WebApplicationInitializers detected on classpath
{"timestamp":"2023-02-02T03:28:15.334Z","level":"INFO","thread":"main","logger":"app","message":"Starting UiApplication using Java 11.0.18 on ui-backend-7b6ddf8f58-m8vtq with PID 1 (/workspace/WEB-INF/classes started by cnb in /workspace)","context":"default","nanotime":27377774090677176}
{"timestamp":"2023-02-02T03:28:15.388Z","level":"DEBUG","thread":"main","logger":"app","message":"Running with Spring Boot v2.5.8, Spring v5.3.14","context":"default","nanotime":27377774144285955}
{"timestamp":"2023-02-02T03:28:15.388Z","level":"INFO","thread":"main","logger":"app","message":"The following profiles are active: dev","context":"default","nanotime":27377774145291720}
[CONTAINER] lina.core.ContainerBase.[Catalina].[localhost].[/] INFO    Initializing Spring embedded WebApplicationContext
{"timestamp":"2023-02-02T03:28:22.755Z","level":"WARN","thread":"main","logger":"org.springframework.boot.autoconfigure.orm.jpa.JpaBaseConfiguration$JpaWebConfiguration","message":"spring.jpa.open-in-view is enabled by default. Therefore, database queries may be performed during view rendering. Explicitly configure spring.jpa.open-in-view to disable this warning","context":"default","nanotime":27377781511405545}
{"timestamp":"2023-02-02T03:28:38.660Z","level":"INFO","thread":"main","logger":"app.route.ui.report.ReportTemplateRegistry","message":"report template registry -> templates -> {receipt_ja=app.route.ui.report.ReceiptReportTemplate@6dc0e992, invoice_ja=app.route.ui.report.InvoiceReportTemplate@6e221246}","context":"default","nanotime":27377797416315839}
{"timestamp":"2023-02-02T03:28:38.675Z","level":"DEBUG","thread":"main","logger":"app.route.ui.service.ReportService","message":"report service -> initialize -> start","context":"default","nanotime":27377797431868400}
{"timestamp":"2023-02-02T03:28:38.721Z","level":"INFO","thread":"main","logger":"app.route.ui.service.ReportService","message":"report service -> initialize -> cache created succefully","context":"default","nanotime":27377797477131943}
{"timestamp":"2023-02-02T03:28:38.721Z","level":"DEBUG","thread":"main","logger":"app.route.ui.service.ReportService","message":"report service -> initialize -> end","context":"default","nanotime":27377797481925043}
[CONTAINER] freemarker.configuration                           SEVERE  DefaultObjectWrapper.incompatibleImprovements was set to the object returned by Configuration.getVersion(). That defeats the purpose of incompatibleImprovements, and makes upgrading FreeMarker a potentially breaking change. Also, this probably won't be allowed starting from 2.4.0. Instead, set incompatibleImprovements to the highest concrete version that's known to be compatible with your application.
{"timestamp":"2023-02-02T03:28:40.289Z","level":"INFO","thread":"main","logger":"app.route.ui.entity.storage.EntityStorageRegistry","message":"entity storage registry -> interceptors -> [app.route.ui.entity.interceptor.UserInterceptor@3a94d628]","context":"default","nanotime":27377799045261311}
{"timestamp":"2023-02-02T03:28:41.091Z","level":"INFO","thread":"main","logger":"org.springframework.security.web.DefaultSecurityFilterChain","message":"Will secure Ant [pattern='/actuator/health/**'] with []","context":"default","nanotime":27377799847478125}
{"timestamp":"2023-02-02T03:28:41.092Z","level":"INFO","thread":"main","logger":"org.springframework.security.web.DefaultSecurityFilterChain","message":"Will secure Ant [pattern='/h2-console/**'] with []","context":"default","nanotime":27377799848958832}
{"timestamp":"2023-02-02T03:28:41.182Z","level":"INFO","thread":"main","logger":"org.springframework.security.web.DefaultSecurityFilterChain","message":"Will secure Or [Ant [pattern='/admin/user/webhook'], Ant [pattern='/v1/backups/webhoook/**']] with [org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter@7843b634, org.springframework.security.web.context.SecurityContextPersistenceFilter@2f514909, org.springframework.security.web.header.HeaderWriterFilter@22c59a39, org.springframework.security.web.authentication.logout.LogoutFilter@395a9cdb, app.route.ui.security.internal.ApiKeyAuthFilter@3d042a50, org.springframework.security.web.savedrequest.RequestCacheAwareFilter@1404f97, org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter@50d8cc71, org.springframework.security.web.authentication.AnonymousAuthenticationFilter@63b040d5, org.springframework.security.web.session.SessionManagementFilter@75c3b5ab, org.springframework.security.web.access.ExceptionTranslationFilter@763630ac, org.springframework.security.web.access.intercept.FilterSecurityInterceptor@7753b2e8]","context":"default","nanotime":27377799938730993}
{"timestamp":"2023-02-02T03:28:41.194Z","level":"INFO","thread":"main","logger":"org.springframework.security.web.DefaultSecurityFilterChain","message":"Will secure any request with [org.springframework.security.web.context.request.async.WebAsyncManagerIntegrationFilter@1c2cee82, org.springframework.security.web.context.SecurityContextPersistenceFilter@3c3e87b8, org.springframework.security.web.header.HeaderWriterFilter@58e130bc, org.springframework.web.filter.CorsFilter@7c067793, org.keycloak.adapters.springsecurity.filter.KeycloakPreAuthActionsFilter@43c49d4c, org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticationProcessingFilter@233b20ed, org.springframework.security.web.authentication.logout.LogoutFilter@4eb259f, org.springframework.security.web.savedrequest.RequestCacheAwareFilter@465b750a, org.springframework.security.web.servletapi.SecurityContextHolderAwareRequestFilter@176ed9ad, org.keycloak.adapters.springsecurity.filter.KeycloakSecurityContextRequestFilter@1de33861, org.keycloak.adapters.springsecurity.filter.KeycloakAuthenticatedActionsFilter@41dfea3, org.springframework.security.web.authentication.AnonymousAuthenticationFilter@25cb293e, org.springframework.security.web.session.SessionManagementFilter@76a5b602, org.springframework.security.web.access.ExceptionTranslationFilter@160d925f, org.springframework.security.web.access.intercept.FilterSecurityInterceptor@3a5e6480]","context":"default","nanotime":27377799956811708}
{"timestamp":"2023-02-02T03:28:42.964Z","level":"INFO","thread":"main","logger":"app","message":"Started UiApplication in 29.288 seconds (JVM running for 44.237)","context":"default","nanotime":27377801720328782}
{"timestamp":"2023-02-02T03:28:43.156Z","level":"DEBUG","thread":"main","logger":"app.route.ui.interceptor.RequestLogFilter","message":"Filter 'requestLogFilter' configured for use","context":"default","nanotime":27377801912945502}
{"timestamp":"2023-02-02T03:28:43.160Z","level":"DEBUG","thread":"main","logger":"app.route.ui.security.UiAuthenticationFilter","message":"Filter 'uiAuthenticationFilter' configured for use","context":"default","nanotime":27377801916847463}
[CONTAINER] org.apache.catalina.startup.HostConfig             INFO    Deployment of web application directory [/layers/paketo-buildpacks_apache-tomcat/catalina-base/webapps/ROOT] has finished in [38,936] ms
[CONTAINER] org.apache.coyote.http11.Http11NioProtocol         INFO    Starting ProtocolHandler ["http-nio-8080"]
[CONTAINER] org.apache.catalina.startup.Catalina               INFO    Server startup in [39138] milliseconds
[CONTAINER] org.apache.coyote.http11.Http11NioProtocol         INFO    Pausing ProtocolHandler ["http-nio-8080"]
[CONTAINER] org.apache.catalina.core.StandardService           INFO    Stopping service [Catalina]
[CONTAINER] lina.core.ContainerBase.[Catalina].[localhost].[/] INFO    Closing Spring root WebApplicationContext

I'm not sure whether this is a Spring problem or a Kubernetes problem. Thank you.

Edit: Other things that I've tried are:

  • draining the nodes
  • running the image with kubectl run, which starts Spring; I can reach the actuator endpoints that way, but can't do much else (rough sketch of that check below).
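
Roughly, that manual check was the following (the pod name here is just a throwaway placeholder; the image is the same one from the events above):

kubectl run ui-backend-test --image=registry.gitlab.com/integration-image --port=8080
kubectl port-forward pod/ui-backend-test 8080:8080
curl localhost:8080/actuator/health/liveness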

Node descriptions:

  • Node where at least the previous deployment still works:
Name:               goodnode.ap-northeast-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5.large
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=ap-northeast-1
                    failure-domain.beta.kubernetes.io/zone=ap-northeast-1a
                    k8s.io/cloud-provider-aws=86af69e954aa5c8fc905ce14364ec305
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=goodnode.ap-northeast-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5.large
                    topology.kubernetes.io/region=ap-northeast-1
                    topology.kubernetes.io/zone=ap-northeast-1a
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 22 Mar 2022 00:32:53 -0600
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  goodnode.ap-northeast-1.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Tue, 07 Feb 2023 13:13:59 -0600
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 07 Feb 2023 13:11:07 -0600   Tue, 22 Mar 2022 00:32:51 -0600   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 07 Feb 2023 13:11:07 -0600   Tue, 22 Mar 2022 00:32:51 -0600   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 07 Feb 2023 13:11:07 -0600   Tue, 22 Mar 2022 00:32:51 -0600   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Tue, 07 Feb 2023 13:11:07 -0600   Tue, 22 Mar 2022 00:33:23 -0600   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   goodnodeip
  Hostname:     goodnode.ap-northeast-1.compute.internal
  InternalDNS:  goodnode.ap-northeast-1.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         2
  ephemeral-storage:           52416492Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7934440Ki
  pods:                        29
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         1930m
  ephemeral-storage:           47233297124
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7244264Ki
  pods:                        29
System Info:
  Machine ID:                 ec26c540fc98771f
  System UUID:                ec26c540-fc98-77
  Boot ID:                    034cb48b-8129-40
  Kernel Version:             5.4.181-99.354.amzn2.x86_64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.7
  Kubelet Version:            v1.21.5-eks-9017834
  Kube-Proxy Version:         v1.21.5-eks-9017834
ProviderID:                   aws:///ap-northeast-1a/i-01a03da4ffb56c6f0
Non-terminated Pods:          (11 in total)
  Namespace                   Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                                               ------------  ----------  ---------------  -------------  ---
  amazon-cloudwatch           cloudwatch-agent-b2cxj                                             200m (10%)    200m (10%)  200Mi (2%)       200Mi (2%)     322d
  amazon-cloudwatch           fluent-bit-f825k                                                   500m (25%)    0 (0%)      100Mi (1%)       200Mi (2%)     322d
  webapp                application A                            500m (25%)    1 (51%)     2G (26%)         4G (53%)       12h
  webapp                application B                                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  webapp                APP_WITH_PROBLEMS                                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         11h
  kube-system                 alb-ingress-controller-aws-load-balancer-controller-76df65d6qrx    0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  kube-system                 alb-ingress-controller-aws-load-balancer-controller-76df65kwmss    0 (0%)        0 (0%)      0 (0%)           0 (0%)         12h
  kube-system                 aws-node-g7bp7                                                     10m (0%)      0 (0%)      0 (0%)           0 (0%)         322d
  kube-system                 coredns-54bc78bc49-6bdv6                                           100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     12h
  kube-system                 coredns-54bc78bc49-vh2rv                                           100m (5%)     0 (0%)      70Mi (0%)        170Mi (2%)     12h
  kube-system                 kube-proxy-lt9bq                                                   100m (5%)     0 (0%)      0 (0%)           0 (0%)         322d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests         Limits
  --------                    --------         ------
  cpu                         1510m (78%)      1200m (62%)
  memory                      2403685Ki (33%)  4664010Ki (64%)
  ephemeral-storage           0 (0%)           0 (0%)
  hugepages-1Gi               0 (0%)           0 (0%)
  hugepages-2Mi               0 (0%)           0 (0%)
  attachable-volumes-aws-ebs  0                0
Events:                       <none>
  • Node where neither the previous deployment nor the new ones work:
Name:               badnode.ap-northeast-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5.large
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=ap-northeast-1
                    failure-domain.beta.kubernetes.io/zone=ap-northeast-1c
                    k8s.io/cloud-provider-aws=86af69e954aa5c8fc905ce14364ec305
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=badnode.ap-northeast-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5.large
                    topology.kubernetes.io/region=ap-northeast-1
                    topology.kubernetes.io/zone=ap-northeast-1c
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 22 Mar 2022 00:25:27 -0600
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  badnode.ap-northeast-1.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Tue, 07 Feb 2023 13:14:01 -0600
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 07 Feb 2023 13:12:59 -0600   Tue, 22 Mar 2022 00:25:25 -0600   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 07 Feb 2023 13:12:59 -0600   Tue, 22 Mar 2022 00:25:25 -0600   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 07 Feb 2023 13:12:59 -0600   Tue, 22 Mar 2022 00:25:25 -0600   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Tue, 07 Feb 2023 13:12:59 -0600   Tue, 22 Mar 2022 00:25:57 -0600   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   badnodeip
  Hostname:     badnode.ap-northeast-1.compute.internal
  InternalDNS:  badnode.ap-northeast-1.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         2
  ephemeral-storage:           52416492Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7934440Ki
  pods:                        29
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         1930m
  ephemeral-storage:           47233297124
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      7244264Ki
  pods:                        29
System Info:
  Machine ID:                 ec28188afbf50b034
  System UUID:                ec28188a-fbf5-0b0
  Boot ID:                    316e2549-9735-4fe4
  Kernel Version:             5.4.181-99.354.amzn2.x86_64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.7
  Kubelet Version:            v1.21.5-eks-9017834
  Kube-Proxy Version:         v1.21.5-eks-9017834
ProviderID:                   aws:///ap-northeast-1c/i-056ee5ae8f99b7076
Non-terminated Pods:          (4 in total)
  Namespace                   Name                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                      ------------  ----------  ---------------  -------------  ---
  amazon-cloudwatch           cloudwatch-agent-v7kk2    200m (10%)    200m (10%)  200Mi (2%)       200Mi (2%)     322d
  amazon-cloudwatch           fluent-bit-mfhxk          500m (25%)    0 (0%)      100Mi (1%)       200Mi (2%)     322d
  kube-system                 aws-node-qsgkm            10m (0%)      0 (0%)      0 (0%)           0 (0%)         322d
  kube-system                 kube-proxy-nlgxk          100m (5%)     0 (0%)      0 (0%)           0 (0%)         322d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests    Limits
  --------                    --------    ------
  cpu                         810m (41%)  200m (10%)
  memory                      300Mi (4%)  400Mi (5%)
  ephemeral-storage           0 (0%)      0 (0%)
  hugepages-1Gi               0 (0%)      0 (0%)
  hugepages-2Mi               0 (0%)      0 (0%)
  attachable-volumes-aws-ebs  0           0
Events:                       <none>
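
The deployment itself is created with Terraform:
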
resource "kubernetes_deployment" "ui_backend" {
  metadata {
    name      = local.name
    namespace = local.namespace
    labels = {
      "app.kubernetes.io/name"       = local.name
      "app.kubernetes.io/managed-by" = local.managed_by
    }
  }

  spec {
    replicas = 1

    selector {
      match_labels = {
        "app.kubernetes.io/name"       = local.name
        "app.kubernetes.io/managed-by" = local.managed_by
      }
    }

    template {
      metadata {
        labels = {
          "app.kubernetes.io/name"       = local.name
          "app.kubernetes.io/managed-by" = local.managed_by
        }
      }

      spec {
        service_account_name = local.name

        image_pull_secrets {
          name = "gitlab-auth"
        }

        container {
          image = var.docker_image
          name  = local.name
          env {
            name  = "JAVA_OPTS"
            value = "-Dspring.profiles.active=${var.environment} "
          }

          port {
            container_port = local.application_port
          }

          liveness_probe {
            http_get {
              path = "/actuator/health/liveness"
              port = local.application_port
            }

            initial_delay_seconds = 60
            period_seconds        = 30
          }

          readiness_probe {
            http_get {
              path = "/actuator/health/readiness"
              port = local.application_port
            }

            initial_delay_seconds = 60
            period_seconds        = 10
          }

        }
      }
    }
  }
}

Logs with logging.level.root=DEBUG:

{"timestamp":"2023-02-08T21:28:40.448Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.availability.ApplicationAvailabilityBean","message":"Application availability state LivenessState changed to CORRECT","context":"default","nanotime":142674280192537}
{"timestamp":"2023-02-08T21:28:40.451Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.devtools.restart.Restarter","message":"Creating new Restarter for thread Thread[main,5,main]","context":"default","nanotime":142674282973006}
{"timestamp":"2023-02-08T21:28:40.454Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.availability.ApplicationAvailabilityBean","message":"Application availability state ReadinessState changed to ACCEPTING_TRAFFIC","context":"default","nanotime":142674286139388}
{"timestamp":"2023-02-08T21:28:40.496Z","level":"DEBUG","thread":"main","logger":"org.springframework.boot.web.servlet.filter.OrderedRequestContextFilter","message":"Filter 'requestContextFilter' configured for use","context":"default","nanotime":142674328227052}
  • What are the differences between the versions? You actually left out the interesting bits in your logging (which is the information about the start (or non-start) of the application). – M. Deinum Feb 07 '23 at 08:36
  • could you post the full pod config yaml? is port 8080 exposed in the pod config? – SimGel Feb 07 '23 at 09:58
  • have you checked the context path: https://stackoverflow.com/questions/55877188/spring-boot-actuator-returns-404-not-found – SimGel Feb 07 '23 at 10:07
  • @M.Deinum I added the description of the nodes, apart from the pods they're not different. I also added the full log – Ricardo Rubik Ruiz Feb 07 '23 at 20:22
  • @SimGel yes, the port 8080 is exposed; I added the config. – Ricardo Rubik Ruiz Feb 07 '23 at 20:30
  • The app takes some time to start about 30 seconds, is that a problem? You could enable full on debug logging `logging.level.root=DEBUG` or at least for the web stuff `logging.level.web=DEBUG` that will give some information on what is happening. – M. Deinum Feb 08 '23 at 08:04
  • @M.Deinum Using the logging level you suggested I can see the log from my last edit, so I can now confidently say that Spring is reaching the readiness state but Tomcat (?) is not binding the port correctly. – Ricardo Rubik Ruiz Feb 08 '23 at 22:36
  • Did the exposed ports change? Here is nothing that should prevent it, so I suggest you go back and check what has changed between the working version and this one. – M. Deinum Feb 09 '23 at 08:00

1 Answer


Remove -javaagent:/workspace/WEB-INF/lib/honeycomb-opentelemetry-javaagent-1.0.0.jar from the JVM options.

Edit: it looks like adding -Dotel.instrumentation.tomcat.enabled=false (disabling the agent's Tomcat instrumentation) will also fix the issue.
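
In this setup, one place the flag can go is the JAVA_OPTS environment variable that the Terraform deployment already defines (a sketch; it assumes the Paketo launcher keeps appending $JAVA_OPTS to $JAVA_TOOL_OPTIONS, as the startup log in the question shows):

JAVA_OPTS="-Dspring.profiles.active=dev -Dotel.instrumentation.tomcat.enabled=false"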

Gentoli
  • Thanks to Stack Overflow I can't comment on the question, but my Spring Boot app also started returning 404 on both the actuator and the web endpoints after adding the opentelemetry-javaagent. – Gentoli Feb 09 '23 at 19:21
  • This was the reason: the pod gets Ready with the latest Tomcat if I either remove the agent or add the new argument. Thanks! – Ricardo Rubik Ruiz Feb 15 '23 at 18:22
  • Isn't that a bug in opentelemetry-javaagent, though? What if we want auto-instrumentation of the Tomcat library? – Kirit Mar 31 '23 at 18:47
  • I agree with @Kirit; I'm having the same problem, but removing the agent is not an option. – Davi Cavalcanti Apr 11 '23 at 04:10