This is a puzzler, and I'm hoping that writing a Stack Overflow question will give me some fresh insights.
In a nutshell, I'm trying to figure out why I can access https://sts.nih.gov from a host machine but not from a Docker container on the same host, even though other sites work just fine from the container.
How I reproduce the problem...
I have a cloud-based machine (Digital Ocean) which can happily establish an HTTPS connection to sts.nih.gov:
# from host machine
curl -vv -o /tmp/test https://sts.nih.gov
If I get a shell in a fresh Docker container, I cannot access that site:
# get a shell within a container
docker run -ti ubuntu:18.04 /bin/bash
# attempt same request...
curl -vv --ipv4 -o /tmp/test https://sts.nih.gov
* Rebuilt URL to: https://sts.nih.gov/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying 128.231.243.251...
* TCP_NODELAY set
  0     0    0     0    0     0      0      0 --:--:--  0:00:31 --:--:--     0
* connect to 128.231.243.251 port 443 failed: Connection timed out
* Failed to connect to sts.nih.gov port 443: Connection timed out
* Closing connection 0
curl: (7) Failed to connect to sts.nih.gov port 443: Connection timed out
One interesting detail: without the --ipv4 flag, curl attempted to connect over IPv6 and failed.
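To make the host-versus-container comparison repeatable, the curl check above could be wrapped in a small script. This is only a sketch: the function names classify_curl_exit and check_url are made up for illustration, and the 10-second connect timeout is an arbitrary choice. Note that with --connect-timeout set, curl reports a connect timeout as exit code 28 rather than the code 7 seen above, which came from the OS-level timeout.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the original setup.

# Map a curl exit status to a short label, using curl's documented exit codes:
# 6 = could not resolve host, 7 = failed to connect,
# 28 = operation timed out, 35 = SSL/TLS connect error.
classify_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns-failure" ;;
    7)  echo "connect-failed" ;;
    28) echo "timeout" ;;
    35) echo "tls-handshake-failed" ;;
    *)  echo "other($1)" ;;
  esac
}

# Fetch a URL over IPv4 with a short connect timeout and print a verdict.
check_url() {
  local url="$1" rc=0
  curl --ipv4 --silent --output /dev/null --connect-timeout 10 "$url" || rc=$?
  printf '%s\t%s\n' "$url" "$(classify_curl_exit "$rc")"
}

# Example (run on the host, then again inside the container):
#   check_url https://sts.nih.gov
#   check_url https://serverfault.com/
```

Running the same two check_url calls in both places gives a compact table of which side fails and at which stage.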
Does this happen for all accesses to external hosts?
No. Within a Docker container, curl -o /tmp/test https://serverfault.com/ works just fine, for example.
Is it a DNS problem?
No, nslookup is able to resolve the address within the container:
nslookup sts.nih.gov
Server: 67.207.67.3
Address: 67.207.67.3#53
Non-authoritative answer:
sts.nih.gov canonical name = sts.ha.nih.gov.
Name: sts.ha.nih.gov
Address: 128.231.243.251
Name: sts.ha.nih.gov
Address: 2607:f220:404:9124:128:231:243:251
I can also use the IP address directly in the request:
curl -vv -o /tmp/test https://128.231.243.251
Same result: a timeout.
Is it specific to https?
No, this seems to be a TCP/IP issue rather than an HTTPS-specific one. Even a plain connectivity check with netcat fails:
netcat -zvn 128.231.243.251 443
(UNKNOWN) [128.231.243.251] 443 (?) : Connection timed out
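The same connectivity check can be done without installing netcat, which is handy in a bare ubuntu:18.04 container. The sketch below assumes bash's /dev/tcp pseudo-device and the coreutils timeout command are available; probe_tcp is a hypothetical helper name and the 5-second default is arbitrary.

```shell
#!/usr/bin/env bash
# probe_tcp is a hypothetical helper, not part of the original setup.
# It attempts a TCP connect via bash's /dev/tcp and reports the outcome.
# Usage: probe_tcp HOST PORT [TIMEOUT_SECONDS]
probe_tcp() {
  local host="$1" port="$2" secs="${3:-5}"
  # The child bash receives host as $0 and port as $1; opening the
  # pseudo-device performs the connect. timeout kills it if it hangs.
  if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
    echo "open"
  else
    echo "unreachable"
  fi
}

# Example: compare ports on the problem address from host and container.
#   for p in 80 443; do
#     echo "128.231.243.251:$p $(probe_tcp 128.231.243.251 "$p")"
#   done
```

The helper does not distinguish a refused connection from a timeout; both are reported as unreachable, which is enough for a quick host-versus-container comparison.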
Is it a routing issue?
It doesn't seem to be; after all, the host can access the problem site, and the Docker container can access other external sites.
Traceroute (which sends UDP probes by default on Linux) shows that packets at least reach the target network, as the nih.gov routers at hops 9 to 11 respond:
traceroute 128.231.243.251
traceroute to 128.231.243.251 (128.231.243.251), 30 hops max, 60 byte packets
1 172.17.0.1 (172.17.0.1) 0.063 ms 0.029 ms 0.023 ms
2 * * *
3 10.80.5.46 (10.80.5.46) 1.758 ms 10.80.5.48 (10.80.5.48) 1.864 ms 10.80.5.38 (10.80.5.38) 4.499 ms
4 138.197.249.112 (138.197.249.112) 1.991 ms 138.197.249.122 (138.197.249.122) 2.179 ms 138.197.249.104 (138.197.249.104) 1.961 ms
5 138.197.251.136 (138.197.251.136) 1.659 ms 138.197.251.142 (138.197.251.142) 1.846 ms 138.197.251.138 (138.197.251.138) 1.799 ms
6 212.187.195.149 (212.187.195.149) 4.005 ms 212.187.195.85 (212.187.195.85) 1.800 ms 1.743 ms
7 * * *
8 4.16.68.166 (4.16.68.166) 76.945 ms 76.901 ms 76.869 ms
9 bth-tic-core-rt-a-te-0-0-0-0.net.nih.gov (156.40.93.1) 77.783 ms 77.754 ms 77.632 ms
10 156.40.93.170 (156.40.93.170) 76.519 ms 76.473 ms 76.429 ms
11 156.40.93.171 (156.40.93.171) 77.745 ms 76.627 ms 77.020 ms
12 * * *
...
30 * * *
I can also get a complete trace using TCP SYN packets (note that traceroute --tcp probes port 80 by default, not 443):
traceroute --tcp 128.231.243.251
traceroute to 128.231.243.251 (128.231.243.251), 30 hops max, 60 byte packets
1 172.17.0.1 (172.17.0.1) 0.066 ms 0.017 ms 0.017 ms
2 * * *
3 10.80.5.34 (10.80.5.34) 1.881 ms 10.80.5.46 (10.80.5.46) 2.113 ms 10.80.5.36 (10.80.5.36) 1.832 ms
4 138.197.249.98 (138.197.249.98) 3.127 ms 138.197.249.120 (138.197.249.120) 1.978 ms 138.197.249.106 (138.197.249.106) 1.853 ms
5 138.197.251.140 (138.197.251.140) 1.784 ms 1.826 ms 138.197.251.132 (138.197.251.132) 1.705 ms
6 212.187.195.149 (212.187.195.149) 2.859 ms 1.457 ms 1.389 ms
7 * * *
8 4.16.68.166 (4.16.68.166) 76.470 ms 76.446 ms 76.520 ms
9 bth-tic-core-rt-a-te-0-0-0-0.net.nih.gov (156.40.93.1) 77.602 ms 77.582 ms 77.492 ms
10 156.40.93.170 (156.40.93.170) 76.005 ms 76.733 ms 76.459 ms
11 * * *
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 128.231.243.251 (128.231.243.251) 77.268 ms 77.215 ms 76.815 ms
Next moves?
At this point, I'm baffled as to how to narrow this down further. To me, it feels like something about the networking at the remote end is unusual, but it only manifests itself through Docker's networking mechanisms.