I am running a set of services in a Docker environment. All services are behind the same nginx reverse proxy container that encrypts with letsencrypt and splits the incoming traffic based on subdomains.

Today all of a sudden (while I was tinkering with another service) my Nextcloud container started returning 502 Bad Gateway when accessed through the reverse proxy.

All other services are doing fine.

In the error.log that nginx writes to, I see many entries like this:

512 connect() failed (111: Connection refused) while connecting to upstream

This led me to believe something was wrong with the Nextcloud container instance.

So I checked the status of the container (I had recently restarted the system, hence "Up 13 minutes"):

docker ps -a | grep Nextcloud
6a4cd6dde4f6   nextcloud:21.0.1                      "/entrypoint.sh apac…"   About an hour ago   Up 13 minutes             80/tcp                                                               Nextcloud

Everything seemed fine here. So I checked the container's output by running docker-compose in the foreground in a terminal (as opposed to running it detached as a daemon), which showed nothing new or interesting. My browser refreshes did not seem to reach the Nextcloud container at all.

After this I wanted to see if the Nextcloud container was responsive at all, so I forwarded the host's port 5555 to the Nextcloud container's port 80 and connected to the host IP directly on port 5555. This worked. I got the "Access through untrusted domain" page, which makes sense since I was accessing it straight through the host's IP.

Ok, so the reverse proxy container is experiencing connection refused, and the Nextcloud container is not receiving any requests at all, but seems to be working fine other than that.

I then created a temporary Ubuntu troubleshooting container and connected it to the same docker network as the reverse proxy container and the Nextcloud container. After installing some tools, I ran these commands:

root@491b7ef0f34f:/# ping Nextcloud
PING Nextcloud (10.10.7.3) 56(84) bytes of data.
64 bytes from Nextcloud.couplernets_nextcloud (10.10.7.3): icmp_seq=1 ttl=64 time=0.127 ms
64 bytes from Nextcloud.couplernets_nextcloud (10.10.7.3): icmp_seq=2 ttl=64 time=0.080 ms
64 bytes from Nextcloud.couplernets_nextcloud (10.10.7.3): icmp_seq=3 ttl=64 time=0.083 ms
64 bytes from Nextcloud.couplernets_nextcloud (10.10.7.3): icmp_seq=4 ttl=64 time=0.085 ms
^C
--- Nextcloud ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3074ms
rtt min/avg/max/mdev = 0.080/0.093/0.127/0.019 ms

root@491b7ef0f34f:/# nmap -sV -p- Nextcloud
Starting Nmap 7.80 ( https://nmap.org ) at 2021-04-25 18:43 UTC
Nmap scan report for Nextcloud (10.10.7.3)
Host is up (0.0000090s latency).
rDNS record for 10.10.7.3: Nextcloud.couplernets_nextcloud
Not shown: 65534 closed ports
PORT   STATE SERVICE VERSION
80/tcp open  http    Apache httpd 2.4.38 ((Debian))
MAC Address: 02:42:0A:0A:07:03 (Unknown)
Service detection performed. Please report any incorrect results at https://nmap.org/submit/
Nmap done: 1 IP address (1 host up) scanned in 7.64 seconds

root@491b7ef0f34f:/# wget Nextcloud:80/index.html
--2021-04-25 18:45:28--  http://nextcloud/index.html
Resolving nextcloud (nextcloud)... 10.10.7.3
Connecting to nextcloud (nextcloud)|10.10.7.3|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 156 [text/html]
Saving to: 'index.html.1'

index.html.1        100%[===================>]     156  --.-KB/s    in 0s

2021-04-25 18:45:28 (7.66 MB/s) - 'index.html.1' saved [156/156]

This tells me that the Nextcloud instance should be fine and dandy since its port is open and I can access the index.html file with no problems.

I then went to check on my nginx reverse proxy configuration.

 server {
    listen 443 ssl;
    listen [::]:443 ssl;

    #Add HSTS preload header
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;

    #Remove revealing headers
    server_tokens off;
    proxy_hide_header X-Powered-By;

    server_name <cloud.domain.topdomain>;

    include /config/nginx/ssl.conf;

    client_max_body_size 0;

    location / {
        include /config/nginx/proxy.conf;
        proxy_max_temp_file_size 2048m;
        proxy_pass http://Nextcloud:80/;
    }
}

This configuration is just the same as all the other services that pass through the very same reverse proxy container. The only thing that differs is the server_name and the proxy_pass config parameters.

From here I have no idea what I should try next. Please help me. Any help is very much appreciated.

1 Answer

It turns out the problem was Docker's internal container-name-to-IP resolution, which was misbehaving.

When I attached to the reverse proxy container and ran ping Nextcloud, the name resolved to the wrong IP address. So whenever the proxy tried to send anything to the Nextcloud container, the Docker network layer resolved the Nextcloud container's IP incorrectly and sent the traffic to the wrong container, resulting in a refused connection.
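
For anyone wanting to check for the same symptom, here is a rough sketch of how the comparison could be scripted from the host. It is only an illustration: the proxy container name reverse-proxy is a placeholder (the real name is not given above), the function names are made up, and it assumes getent exists in the proxy image (ping, as used above, shows the same information).

```shell
#!/bin/sh
# Hypothetical defaults: PROXY is a placeholder name for the reverse proxy
# container; TARGET is the upstream container name used in proxy_pass.
PROXY="${PROXY:-reverse-proxy}"
TARGET="${TARGET:-Nextcloud}"

# IP that the resolver inside the proxy container returns for the target name.
# Assumes getent is installed in the proxy image.
resolved_ip() {
    docker exec "$PROXY" getent hosts "$TARGET" | awk '{print $1; exit}'
}

# IPs Docker has actually assigned to the target, one per attached network.
assigned_ips() {
    docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' "$TARGET"
}

# Prints "match" if $1 appears in the space-separated list $2, else "mismatch".
compare_ips() {
    for ip in $2; do
        [ "$ip" = "$1" ] && { echo match; return 0; }
    done
    echo mismatch
    return 1
}
```

With those functions loaded, compare_ips "$(resolved_ip)" "$(assigned_ips)" would print mismatch in the situation described here, since the proxy was resolving the name to an address that did not belong to the Nextcloud container.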

My solution was to bring down the container that was incorrectly resolved. After that, my Nextcloud container was reachable using the hostname "Nextcloud".
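
To find out which container actually owns the address that the name resolves to, something along these lines may help. Again a sketch under assumptions: the network name couplernets_nextcloud is taken from the ping output in the question, while the function names and the address 10.10.7.5 in the usage line are purely illustrative.

```shell
#!/bin/sh
# Lists "name address/prefix" for every container attached to network $1.
list_network_members() {
    docker network inspect \
        -f '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{println}}{{end}}' "$1"
}

# Reads that listing on stdin and prints the name of the container whose
# address (with the /prefix stripped) equals $1. Prints nothing if none match.
owner_of_ip() {
    awk -v ip="$1" '{ split($2, a, "/"); if (a[1] == ip) print $1 }'
}
```

For example, list_network_members couplernets_nextcloud | owner_of_ip 10.10.7.5 would name the container holding 10.10.7.5 on that network, which is the one to inspect or bring down.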

What caused the problem remains unclear. I will come back here if I ever find out.