I am currently setting up a bare-metal Kubernetes cluster with two nodes, using MetalLB as the load balancer.
The ingress controller is nginx, installed via Helm: `helm install nginx nginx/nginx`
cert-manager is also installed via Helm: `helm install cert-manager jetstack/cert-manager -n cert-manager`

To get HTTPS working I followed the [cert-manager][1] instructions.
Unfortunately this does not work: when deploying kuard I get the error `Waiting for HTTP-01 challenge propagation: wrong status code '404', expected '200'`.

A curl to the Pod of acme via IP returns a 200 with a token:

curl -I -H "Host: example.mydns.dev" 10.96.151.217:8089/.well-known/acme-challenge/aDEelPosRNx9HoA3QkTOPRNbWCK8UjOkszdtCh7Wogw
HTTP/1.1 200 OK
Cache-Control: no-cache, no-store, must-revalidate
Date: Fri, 07 Oct 2022 16:42:44 GMT
Content-Length: 87
Content-Type: text/plain; charset=utf-8

A curl to the same path via the DNS name returns a 308:

❯ curl -I -H "Host: example.mydns.dev" example.mydns.dev/.well-known/acme-challenge/aDEelPosRNx9HoA3QkTOPRNbWCK8UjOkszdtCh7Wogw
HTTP/1.1 308 Permanent Redirect
Location: https://example.mydns.dev/.well-known/acme-challenge/aDEelPosRNx9HoA3QkTOPRNbWCK8UjOkszdtCh7Wogw
Date: Fri, 07 Oct 2022 16:43:46 GMT
Content-Length: 18
Content-Type: text/plain; charset=utf-8
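
For reference, the two requests above can be compared with a small helper. This is only a minimal sketch: the function names `http_status` and `explain_status` are made up for illustration, and the explanations are rough hints rather than a diagnosis of what ingress-nginx is doing.

```shell
# Print only the HTTP status code for a URL, without following redirects.
# Arguments: $1 = URL, $2 = optional Host header value.
http_status() {
  url="$1"
  host="${2:-}"
  if [ -n "$host" ]; then
    curl -s -o /dev/null -w '%{http_code}' -H "Host: $host" "$url"
  else
    curl -s -o /dev/null -w '%{http_code}' "$url"
  fi
}

# Map a status code to a rough hint for the HTTP-01 check (hypothetical helper).
explain_status() {
  case "$1" in
    200) echo "ok: challenge token is served" ;;
    301|302|307|308) echo "redirect: answered with a redirect instead of the token" ;;
    404) echo "not found: whatever answered does not serve this token" ;;
    *) echo "unexpected status: $1" ;;
  esac
}

# Usage, with TOKEN set to the challenge token from the solver ingress:
#   explain_status "$(http_status "http://10.96.151.217:8089/.well-known/acme-challenge/$TOKEN" example.mydns.dev)"
#   explain_status "$(http_status "http://example.mydns.dev/.well-known/acme-challenge/$TOKEN")"
```

Running both calls side by side makes it easy to see whether the direct request and the request through the ingress disagree.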

My guess is that ingress-nginx is misconfigured. Here is some console output:

❯ kubectl get ingress
NAME                        CLASS    HOSTS               ADDRESS        PORTS     AGE
cm-acme-http-solver-2j9p5   <none>   example.mydns.dev   192.168.69.0   80        13m
kuard                       <none>   example.mydns.dev   192.168.69.0   80, 443   13m

❯ kubectl describe ingress
Name:             cm-acme-http-solver-2j9p5
Labels:           acme.cert-manager.io/http-domain=1704593603
                  acme.cert-manager.io/http-token=1120145148
                  acme.cert-manager.io/http01-solver=true
Namespace:        default
Address:          192.168.69.0
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host                    Path  Backends
  ----                    ----  --------
  example.mydns.dev
                          /.well-known/acme-challenge/aDEelPosRNx9HoA3QkTOPRNbWCK8UjOkszdtCh7Wogw   cm-acme-http-solver-x6659:8089 (10.44.0.3:8089)
Annotations:              kubernetes.io/ingress.class: nginx
                          nginx.ingress.kubernetes.io/whitelist-source-range: 0.0.0.0/0,::/0
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    13m (x2 over 13m)  nginx-ingress-controller  Scheduled for sync


Name:             kuard
Labels:           <none>
Namespace:        default
Address:          192.168.69.0
Ingress Class:    <none>
Default backend:  <default>
TLS:
  example-tls terminates example.mydns.dev
Rules:
  Host                    Path  Backends
  ----                    ----  --------
  example.mydns.dev
                          /   kuard:80 (10.32.0.7:8080)
Annotations:              cert-manager.io/cluster-issuer: letsencrypt-staging
                          kubernetes.io/ingress.class: nginx
Events:
  Type    Reason             Age                From                       Message
  ----    ------             ----               ----                       -------
  Normal  CreateCertificate  13m                cert-manager-ingress-shim  Successfully created Certificate "example-tls"
  Normal  Sync               13m (x2 over 13m)  nginx-ingress-controller   Scheduled for sync

These are the files I used:

ClusterIssuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    # The ACME server URL
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: info@mydns.dev
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-staging
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class:  nginx

Kuard Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kuard
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-staging"

spec:
  tls:
  - hosts:
    - example.mydns.dev
    secretName: example-tls
  rules:
  - host: example.mydns.dev
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kuard
            port:
              number: 80

Here are some debug outputs:

❯ kubectl get certificate
NAME          READY   SECRET        AGE
example-tls   False   example-tls   2m19s

❯ kubectl get orders -o wide
NAME                           STATE     ISSUER                REASON   AGE
example-tls-42np9-1759938310   pending   letsencrypt-staging            2m22s

❯ kubectl get challenge -o wide
NAME                                     STATE     DOMAIN                   REASON                                                                               AGE
example-tls-42np9-1759938310-206991323   pending   example.mydns.dev   Waiting for HTTP-01 challenge propagation: wrong status code '404', expected '200'   2m31s
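
A pending challenge can be inspected a bit further with plain kubectl. The following is a minimal sketch; the function names are made up, and `deploy/cert-manager` in the `cert-manager` namespace is an assumption based on the Helm release name used above.

```shell
# Print name and reason of every Challenge in a namespace (default: "default").
challenge_reasons() {
  ns="${1:-default}"
  kubectl get challenges -n "$ns" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.reason}{"\n"}{end}'
}

# Show the full description, including events, of one Challenge.
# Arguments: $1 = challenge name, $2 = optional namespace.
challenge_events() {
  kubectl describe challenge "$1" -n "${2:-default}"
}

# Tail the cert-manager controller logs.
# Assumption: the controller runs as deploy/cert-manager in namespace cert-manager.
cert_manager_logs() {
  kubectl logs -n cert-manager deploy/cert-manager --tail="${1:-50}"
}

# Usage:
#   challenge_reasons
#   challenge_events example-tls-42np9-1759938310-206991323
#   cert_manager_logs 100
```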

❯ kubectl describe secret example-tls-x45jn
Name:         example-tls-x45jn
Namespace:    default
Labels:       cert-manager.io/next-private-key=true
Annotations:  <none>

Type:  Opaque

Data
====
tls.key:  1704 bytes

Any hints or tips would be appreciated! Thank you.

[1]: https://cert-manager.io/docs/tutorials/acme/nginx-ingress/

Robert Fent

2 Answers

Solution: I set up a new cluster and disabled AppArmor on Ubuntu 22.04 LTS.

sudo systemctl stop apparmor
sudo systemctl disable apparmor
sudo systemctl restart containerd

Everything is now working as intended!

Robert Fent

I am just sharing my experience with this issue, which I was able to solve. I have basically the same setup as you.

I could reach my challenge from external addresses and still got the same error as you. Your assumption was right (and helped me a lot to understand this mess): the error comes from a pod trying to access the challenge path via its DNS name. In a bare-metal setup you are most likely behind a firewall performing NAT, and in this particular case the request ends up in a NAT loop (NAT loopback), which some firewalls do not support (e.g. pfSense).

To elaborate on this error, which is caused by the preflight check:

  • First, the check is performed by a pod (created/managed by cert-manager) inside the cluster itself.
  • Second, the NAT loopback happens because that pod tries to access the challenge URL (something like /.well-known/acme-challenge/xxxxxxxxxxxxxxxxxxxxxxxxx) by resolving the domain to be certified. It gets a public address, so the request is NATed out to the internet and routed back to your firewall, which NATs it to your cluster load balancer (the NAT loop).

POD => Firewall => NAT => Internet => NAT => firewall => LOAD BALANCER is the loop causing the trouble.
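
To see the loop for yourself, you can request the challenge URL from a throwaway pod, so that DNS resolution and routing happen inside the cluster. This is a minimal sketch, not what cert-manager does internally: the pod name `acme-selfcheck` and the function names are made up, and it assumes the cluster can pull the public `curlimages/curl` image.

```shell
# Build the challenge URL that the preflight check requests.
# Arguments: $1 = domain, $2 = challenge token.
challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s' "$1" "$2"
}

# Request the URL from a temporary pod and print the HTTP status code.
# Assumption: the curlimages/curl image is pullable from the cluster.
check_from_cluster() {
  kubectl run acme-selfcheck --rm -i --restart=Never \
    --image=curlimages/curl --command -- \
    curl -sS -o /dev/null -w '%{http_code}\n' --max-time 10 \
    "$(challenge_url "$1" "$2")"
}

# Usage, with TOKEN set to the token from the solver ingress path:
#   check_from_cluster example.mydns.dev "$TOKEN"
```

If the same URL works from an external machine but fails or times out from inside the cluster, that points to the NAT loopback described above.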

The solution:

  1. Have a local DNS server (in my case my firewall) resolve the domain to be certified to the load balancer IP (from the MetalLB range).
  2. Use that DNS server to resolve your domain inside the cluster, following https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/ and editing the CoreDNS config with kubectl edit configmap coredns -n kube-system.
  3. Delete your ingress and then re-create it (so that all related pods are deleted and pick up the new DNS settings).
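
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not my exact configuration: the function names and the pod name `dns-check` are made up, the DNS server IP in the usage comment is a placeholder, and the server block follows the stub-domain pattern from the linked Kubernetes page.

```shell
# Print a CoreDNS server block that forwards lookups for one domain to a
# local DNS server. Paste the output into the Corefile shown by
# "kubectl edit configmap coredns -n kube-system".
# Arguments: $1 = domain, $2 = IP of the local DNS server.
coredns_forward_block() {
  cat <<EOF
$1:53 {
    errors
    cache 30
    forward . $2
}
EOF
}

# Ask the cluster DNS what the domain resolves to from inside a pod.
# Assumption: the cluster can pull the public busybox image.
resolve_in_cluster() {
  kubectl run dns-check --rm -i --restart=Never --image=busybox:1.36 \
    --command -- nslookup "$1"
}

# Usage (192.168.1.1 is a placeholder for your firewall / local DNS):
#   coredns_forward_block example.mydns.dev 192.168.1.1
#   resolve_in_cluster example.mydns.dev   # should return the MetalLB IP
#   kubectl delete ingress kuard           # then re-apply the ingress manifest
```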

This worked for me, and with a little luck (since I don't understand half of it) it should work for you too.

PS: I am posting my solution here because I could not find it anywhere else and lost 3 days on this. Sorry for my poor English.

Axfalt