
I have Vault deployed from the official Helm chart, running in HA mode with auto-unseal, TLS enabled, and Raft as the storage backend, on a Kubernetes 1.17 cluster in EKS. All of the Raft followers are joined to the vault-0 pod as the leader. I have followed this tutorial to a T, and I always end up with a TLS bad certificate error. The exact error is: http: TLS handshake error from 123.45.6.789:52936: remote error: tls: bad certificate

I did find one issue when following this tutorial exactly: the step where the Kubernetes CA is piped to base64. For me the output was multi-line and the chart failed to deploy, so I piped that output through tr -d '\n'. That is the point where I get this error. I've also tried the step of launching a container and testing it with curl. It fails, and when I tail the agent injector logs I see that bad certificate error.
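To rule out the newline stripping itself as the cause, here is a minimal sketch of that encoding step with a round-trip check. It assumes a POSIX shell and a base64 that accepts -d for decoding (GNU coreutils does). The temporary file is a hypothetical stand-in for vault-injector.ca, not the real CA.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for vault-injector.ca: any file long enough that
# base64 wraps its output onto several lines.
ca_file="$(mktemp)"
i=0
while [ "$i" -lt 20 ]; do
  printf 'line %02d of a stand-in certificate body\n' "$i" >> "$ca_file"
  i=$((i + 1))
done

# Encode, then strip the line wraps so the value fits on a single YAML line.
ca_bundle="$(base64 < "$ca_file" | tr -d '\n')"

# Decode the single-line value again and compare it byte for byte
# with the input file.
decoded="$(mktemp)"
printf '%s' "$ca_bundle" | base64 -d > "$decoded"
cmp -s "$ca_file" "$decoded" && echo "round trip OK"
```

If the round trip succeeds, stripping the newlines did not change the encoded bytes, so the caBundle value should decode to the same file that went in. Whether that file is the CA that actually signed vault.crt is a separate question that this check does not answer.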

Here is my values.yaml if it helps.

global:
  tlsDisable: false

injector:
  metrics:
    enabled: true

  certs:
    secretName: vault-tls
    caBundle: "(output of cat vault-injector.ca | base64 | tr -d '\n')"
    certName: vault.crt
    keyName: vault.key

server:
  extraEnvironmentVars:
    VAULT_CACERT: "/vault/userconfig/vault-tls/vault.ca"

  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: eks-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: eks-creds
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: VAULT_UNSEAL_KMS_KEY_ID
      secretName: vault-kms-id
      secretKey: VAULT_UNSEAL_KMS_KEY_ID

  extraVolumes:
    - type: secret
      name: vault-tls
    - type: secret
      name: eks-creds
    - type: secret
      name: vault-kms-id

  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 512Mi
      cpu: 500m

  auditStorage:
    enabled: true
    storageClass: gp2

  standalone:
    enabled: false

  ha:
    enabled: true

    raft:
      enabled: true

      config: |
        ui = true

        api_addr = "[::]:8200"
        cluster_addr = "[::]:8201"

        listener "tcp" {
          tls_disable = 0
          tls_cert_file = "/vault/userconfig/vault-tls/vault.crt"
          tls_key_file = "/vault/userconfig/vault-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-tls/vault.ca"
          tls_min_version = "tls12"
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }

        storage "raft" {
          path = "/vault/data"
        }

        disable_mlock = true
        service_registration "kubernetes" {}

        seal "awskms" {
          region     = "us-east-1"
          kms_key_id = "VAULT_UNSEAL_KMS_KEY_ID"
        }

ui:
  enabled: true

I've exec'd into the agent-injector pod and poked around. I can see that the files under /etc/webhook/certs/ are there, and they look correct.

Here is my vault-agent-injector pod

kubectl describe pod vault-agent-injector-6bbf84484c-q8flv
Name:         vault-agent-injector-6bbf84484c-q8flv
Namespace:    default
Priority:     0
Node:         ip-172-16-3-151.ec2.internal/172.16.3.151
Start Time:   Sat, 19 Dec 2020 16:27:14 -0800
Labels:       app.kubernetes.io/instance=vault
              app.kubernetes.io/name=vault-agent-injector
              component=webhook
              pod-template-hash=6bbf84484c
Annotations:  kubernetes.io/psp: eks.privileged
Status:       Running
IP:           172.16.3.154
IPs:
  IP:           172.16.3.154
Controlled By:  ReplicaSet/vault-agent-injector-6bbf84484c
Containers:
  sidecar-injector:
    Container ID:  docker://2201b12c9bd72b6b85d855de6917548c9410e2b982fb5651a0acd8472c3554fa
    Image:         hashicorp/vault-k8s:0.6.0
    Image ID:      docker-pullable://hashicorp/vault-k8s@sha256:5697b85bc69aa07b593fb2a8a0cd38daefb5c3e4a4b98c139acffc9cfe5041c7
    Port:          <none>
    Host Port:     <none>
    Args:
      agent-inject
      2>&1
    State:          Running
      Started:      Sat, 19 Dec 2020 16:27:15 -0800
    Ready:          True
    Restart Count:  0
    Liveness:       http-get https://:8080/health/ready delay=1s timeout=5s period=2s #success=1 #failure=2
    Readiness:      http-get https://:8080/health/ready delay=2s timeout=5s period=2s #success=1 #failure=2
    Environment:
      AGENT_INJECT_LISTEN:              :8080
      AGENT_INJECT_LOG_LEVEL:           info
      AGENT_INJECT_VAULT_ADDR:          https://vault.default.svc:8200
      AGENT_INJECT_VAULT_AUTH_PATH:     auth/kubernetes
      AGENT_INJECT_VAULT_IMAGE:         vault:1.5.4
      AGENT_INJECT_TLS_CERT_FILE:       /etc/webhook/certs/vault.crt
      AGENT_INJECT_TLS_KEY_FILE:        /etc/webhook/certs/vault.key
      AGENT_INJECT_LOG_FORMAT:          standard
      AGENT_INJECT_REVOKE_ON_SHUTDOWN:  false
      AGENT_INJECT_TELEMETRY_PATH:      /metrics
    Mounts:
      /etc/webhook/certs from webhook-certs (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from vault-agent-injector-token-k8ltm (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  webhook-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vault-tls
    Optional:    false
  vault-agent-injector-token-k8ltm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vault-agent-injector-token-k8ltm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                                   Message
  ----    ------     ----  ----                                   -------
  Normal  Scheduled  40m   default-scheduler                      Successfully assigned default/vault-agent-injector-6bbf84484c-q8flv to ip-172-16-3-151.ec2.internal
  Normal  Pulled     40m   kubelet, ip-172-16-3-151.ec2.internal  Container image "hashicorp/vault-k8s:0.6.0" already present on machine
  Normal  Created    40m   kubelet, ip-172-16-3-151.ec2.internal  Created container sidecar-injector
  Normal  Started    40m   kubelet, ip-172-16-3-151.ec2.internal  Started container sidecar-injector

My Vault agent injector deployment

kubectl describe deployment vault
Name:                   vault-agent-injector
Namespace:              default
CreationTimestamp:      Sat, 19 Dec 2020 16:27:14 -0800
Labels:                 app.kubernetes.io/instance=vault
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=vault-agent-injector
                        component=webhook
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app.kubernetes.io/instance=vault,app.kubernetes.io/name=vault-agent-injector,component=webhook
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app.kubernetes.io/instance=vault
                    app.kubernetes.io/name=vault-agent-injector
                    component=webhook
  Service Account:  vault-agent-injector
  Containers:
   sidecar-injector:
    Image:      hashicorp/vault-k8s:0.6.0
    Port:       <none>
    Host Port:  <none>
    Args:
      agent-inject
      2>&1
    Liveness:   http-get https://:8080/health/ready delay=1s timeout=5s period=2s #success=1 #failure=2
    Readiness:  http-get https://:8080/health/ready delay=2s timeout=5s period=2s #success=1 #failure=2
    Environment:
      AGENT_INJECT_LISTEN:              :8080
      AGENT_INJECT_LOG_LEVEL:           info
      AGENT_INJECT_VAULT_ADDR:          https://vault.default.svc:8200
      AGENT_INJECT_VAULT_AUTH_PATH:     auth/kubernetes
      AGENT_INJECT_VAULT_IMAGE:         vault:1.5.4
      AGENT_INJECT_TLS_CERT_FILE:       /etc/webhook/certs/vault.crt
      AGENT_INJECT_TLS_KEY_FILE:        /etc/webhook/certs/vault.key
      AGENT_INJECT_LOG_FORMAT:          standard
      AGENT_INJECT_REVOKE_ON_SHUTDOWN:  false
      AGENT_INJECT_TELEMETRY_PATH:      /metrics
    Mounts:
      /etc/webhook/certs from webhook-certs (ro)
  Volumes:
   webhook-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  vault-tls
    Optional:    false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   vault-agent-injector-6bbf84484c (1/1 replicas created)
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  46m   deployment-controller  Scaled up replica set vault-agent-injector-6bbf84484c to 1

What else can I check and verify or troubleshoot in order to figure out why the agent injector is causing this error?

Byron Mansfield