
Although I have not added any iptables rules on the host or in the two containers, packets from one Docker container are rewritten so that they carry the IP of the Docker network gateway:

Container 1:

bash-5.0# ip route
default via 172.16.238.2 dev eth0
10.6.0.0/24 via 172.16.238.1 dev eth0
172.16.238.0/24 dev eth0 scope link  src 172.16.238.7

bash-5.0# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    23: eth0@if24: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
        link/ether 02:42:ac:10:ee:07 brd ff:ff:ff:ff:ff:ff
        inet 172.16.238.7/24 brd 172.16.238.255 scope global eth0
           valid_lft forever preferred_lft forever

bash-5.0# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
--- 1.1.1.1 ping statistics ---
7 packets transmitted, 0 packets received, 100% packet loss

Container 2:

root@c8d6fa7eab4d:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN group default qlen 1000
    link/none
    inet 100.71.37.47/32 scope global wg0
       valid_lft forever preferred_lft forever
17: eth0@if18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:10:ee:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.238.2/24 brd 172.16.238.255 scope global eth0
       valid_lft forever preferred_lft forever

root@c8d6fa7eab4d:/# tcpdump -i eth0 dst 1.1.1.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
16:34:06.910548 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 36, length 64
16:34:07.910920 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 37, length 64
16:34:08.911322 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 38, length 64
16:34:09.911709 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 39, length 64
16:34:10.912143 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 40, length 64
16:34:11.912504 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 41, length 64
16:34:12.912932 IP 172.16.238.1 > one.one.one.one: ICMP echo request, id 5632, seq 42, length 64
^C
7 packets captured
9 packets received by filter
0 packets dropped by kernel

Host:

root@raspberrypi:~# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  -- !172.16.238.0/24      172.16.238.2
MASQUERADE  all  -- !172.16.238.0/24      172.16.238.2
MASQUERADE  all  --  172.17.0.0/16        anywhere
MASQUERADE  all  --  172.16.238.0/24      anywhere
MASQUERADE  all  -- !172.16.238.0/24      172.16.238.2
MASQUERADE  all  --  10.6.0.0/24          anywhere             /* wireguard-nat-rule */
MASQUERADE  all  -- !172.16.238.0/24      172.16.238.2
MASQUERADE  all  --  172.16.238.0/24      anywhere
MASQUERADE  tcp  --  172.16.238.4         172.16.238.4         tcp dpt:https
MASQUERADE  tcp  --  172.16.238.4         172.16.238.4         tcp dpt:http
MASQUERADE  tcp  --  172.16.238.4         172.16.238.4         tcp dpt:domain
MASQUERADE  udp  --  172.16.238.4         172.16.238.4         udp dpt:domain
MASQUERADE  tcp  --  172.16.238.5         172.16.238.5         tcp dpt:https
MASQUERADE  tcp  --  172.16.238.5         172.16.238.5         tcp dpt:http
MASQUERADE  tcp  --  172.16.238.5         172.16.238.5         tcp dpt:domain
MASQUERADE  udp  --  172.16.238.5         172.16.238.5         udp dpt:domain

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere
RETURN     all  --  anywhere             anywhere
DNAT       tcp  --  anywhere             192.168.178.2        tcp dpt:https to:172.16.238.4:443
DNAT       tcp  --  anywhere             192.168.178.2        tcp dpt:http to:172.16.238.4:80
DNAT       tcp  --  anywhere             192.168.178.2        tcp dpt:domain to:172.16.238.4:53
DNAT       udp  --  anywhere             192.168.178.2        udp dpt:domain to:172.16.238.4:53
DNAT       tcp  --  anywhere             192.168.178.3        tcp dpt:https to:172.16.238.5:443
DNAT       tcp  --  anywhere             192.168.178.3        tcp dpt:http to:172.16.238.5:80
DNAT       tcp  --  anywhere             192.168.178.3        tcp dpt:domain to:172.16.238.5:53
DNAT       udp  --  anywhere             192.168.178.3        udp dpt:domain to:172.16.238.5:53
# Warning: iptables-legacy tables present, use iptables-legacy to see them

root@raspberrypi:~# iptables-legacy -S -t nat
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
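
A note on the listings above: iptables -L without -v hides the in/out interface matches, so the "! -o br-…" part of the MASQUERADE rules is not shown. Listing the table with -S, or with -v, makes visible which bridge each rule is tied to:

iptables -t nat -S POSTROUTING | grep MASQUERADE   # rule syntax, including the "! -o br-..." match
iptables -t nat -L POSTROUTING -v -n               # table view with in/out interface columns and counters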
Trigus
  • You can get rid of masquerading by adding the following flag to docker network create `--opt com.docker.network.bridge.enable_ip_masquerade=false` Unfortunately, you then seem to lose access to the internet from the containers attached to that network... – Yann 4201 Oct 12 '22 at 16:21
  • One solution may be to attach the containers that need internet access to a second network with masquerading enabled... – Yann 4201 Oct 12 '22 at 16:30
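
A rough sketch of what these comments describe; the network and container names are only examples:

# bridge network without masquerading (containers on it lose outbound internet access)
docker network create \
  --opt com.docker.network.bridge.enable_ip_masquerade=false \
  no-masq-net

# workaround from the second comment: additionally attach containers that still
# need internet access to a normal, masquerading network
docker network create masq-net
docker network connect masq-net some-container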

2 Answers


Inter-Container Connectivity (ICC) is enabled by default and lets containers reach each other automatically, without using --link or defining a network.

If you're looking to disable it, set "icc": false in your Docker daemon configuration (/etc/docker/daemon.json).
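
A minimal sketch of that setting, assuming the default configuration path /etc/docker/daemon.json and that the file does not already contain other keys (merge instead of overwriting if it does):

# disable inter-container communication on the default bridge
cat > /etc/docker/daemon.json <<'EOF'
{
  "icc": false
}
EOF
systemctl restart docker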

Left undefined or set to true, it makes the Docker daemon create connected networks along with container creation. These networks are the source of the iptables rules you see.

See more in the official Docker bridge networking tutorial.
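
One way to watch these rules come and go, using an example network name:

docker network create demo-net                     # adds a per-subnet MASQUERADE rule
iptables -t nat -S POSTROUTING | grep MASQUERADE
docker network rm demo-net                         # removes it again
iptables -t nat -S POSTROUTING | grep MASQUERADE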

JeffRSon
micke
  • I actually added them to the same bridge network on purpose. But I don't understand why, when pinging one container on that network, the ICMP packets appear to originate from the bridge network's gateway rather than from the container directly. I suspect the problem is these rules: "MASQUERADE all -- 172.16.238.0/24 anywhere", but I don't know why they are set in the first place. – Trigus Mar 02 '21 at 22:09

I had the same issue, and I believe that in my case it is related to:

https://github.com/moby/moby/issues/43440

The problem is that I created a Docker network, removed it, and then created another one. Docker is smart enough to reuse the same IP range (172.18.0.0/16 in my case), but firewalld seems to keep track of the former Docker network:

# iptables -t nat -S 
...
-A POSTROUTING -s 172.18.0.0/16 ! -o br-4a99e748fcc1 -j MASQUERADE
-A POSTROUTING -s 172.18.0.0/16 ! -o br-9dbbf26e610f -j MASQUERADE
...

Here br-4a99e748fcc1 is indeed the existing interface, but br-9dbbf26e610f is a leftover: the bridge was deleted, yet never permanently removed from firewalld.

# ip add show br-9dbbf26e610f
Device "br-9dbbf26e610f" does not exist.

If I remove the stale rule, everything is fine again: the masquerading (source IP replaced by the gateway address) no longer happens:

# iptables -t nat -D POSTROUTING -s 172.18.0.0/16 ! -o br-9dbbf26e610f -j MASQUERADE

This makes perfect sense: the rule says that any packet

  • originating from source range 172.18.0.0/16
  • that does not go out through br-9dbbf26e610f

should be masqueraded... and of course no packet ever goes out through that non-existent interface, so the rule ends up masquerading everything coming from your Docker network.
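
One way to confirm which rule is matching is to watch the packet counters while pinging from a container; the counter of the stale rule keeps increasing even though its bridge no longer exists:

# -v adds packet/byte counters, --line-numbers makes the offending rule easy to delete
iptables -t nat -L POSTROUTING -v -n --line-numbers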

EDIT: firewall-cmd --reload re-creates the stale rules!

As explained in the Docker issue above, I finally ended up calling firewall-cmd to remove the zombie interfaces from the docker zone. This had to be done while the Docker daemon was down, because it seems to keep track of these zombie interfaces otherwise...

# stop Docker while cleaning up; otherwise the zombie interfaces reappear
systemctl stop docker

# walk every interface registered in firewalld's "docker" zone and
# remove the ones that no longer exist on the system
for interface in $(firewall-cmd --zone=docker --list-interfaces)
do
    if ! ip link ls "${interface}" >/dev/null 2>&1
    then
        firewall-cmd --zone=docker --remove-interface="${interface}"
        firewall-cmd --runtime-to-permanent
        firewall-cmd --reload
    fi
done

systemctl start docker
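
A quick sanity check afterwards is that every remaining per-network MASQUERADE rule refers to a bridge that actually exists:

iptables -t nat -S POSTROUTING | grep MASQUERADE   # rules and their "! -o br-..." interfaces
ip -br link show type bridge                       # bridges currently present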
Yann 4201