
My goal is to have a routable public IPv6 address for each of my docker containers. I want to be able to connect into and out of my containers using the IPv6 protocol.

I'm using Linode and I've been assigned a public IPv6 pool:

2600:3c01:e000:00e2:: / 64 routed to 2600:3c01::f03c:91ff:feae:d7d7

That "routed to" address was auto-configured by dhcp:

# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
       valid_lft 2591987sec preferred_lft 604787sec
    inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
       valid_lft forever preferred_lft forever

I set up an AAAA record for ipv6.daaku.org to make it easier to work with:

# nslookup -q=AAAA ipv6.daaku.org
ipv6.daaku.org  has AAAA address 2600:3c01:e000:e2::1
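
(The record itself is just a plain AAAA entry; in zone-file form it looks roughly like this, with an arbitrary TTL:)

ipv6.daaku.org.    300    IN    AAAA    2600:3c01:e000:e2::1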

To test, I assigned that address manually:

# ip -6 addr add 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2600:3c01:e000:e2::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
       valid_lft 2591984sec preferred_lft 604784sec
    inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
       valid_lft forever preferred_lft forever

I can now ping this from my IPv6-enabled home network:

# ping6 -c3 ipv6.daaku.org
PING6(56=40+8+8 bytes) 2601:9:400:12ab:1db7:a353:a7b4:c192 --> 2600:3c01:e000:e2::1
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=0 hlim=54 time=16.855 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=1 hlim=54 time=19.506 ms
16 bytes from 2600:3c01:e000:e2::1, icmp_seq=2 hlim=54 time=17.467 ms

--- ipv6.daaku.org ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 16.855/17.943/19.506/1.133 ms

I then removed the address, since I want it only in the container, and went back to the original state:

# ip -6 addr del 2600:3c01:e000:00e2::1/64 dev eth0
# ip -6 addr show eth0
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2600:3c01::f03c:91ff:feae:d7d7/64 scope global mngtmpaddr dynamic
       valid_lft 2591987sec preferred_lft 604787sec
    inet6 fe80::f03c:91ff:feae:d7d7/64 scope link
       valid_lft forever preferred_lft forever

I started a docker container without a network in another terminal:

# docker run -it --rm --net=none debian bash
root@b96ea38f03b3:/#

Stuck its PID in a variable for ease of use:

CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' b96ea38f03b3)

Set up the netns for that PID:

# mkdir -p /run/netns
# ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID

Created a new macvlan device, moved it into the container's namespace, and assigned it the IP:

# ip link add container0 link eth0 type macvlan
# ip link set container0 netns $CONTAINER_PID
# ip netns exec $CONTAINER_PID ip link set dev container0 name eth0
# ip netns exec $CONTAINER_PID ip link set eth0 up
# ip netns exec $CONTAINER_PID ip addr add 2600:3c01:e000:00e2::1/64 dev eth0

Back in the other terminal where I started the container:

# ip -6 addr show eth0
22: eth0@gre0: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500
    inet6 2600:3c01::a083:1eff:fea5:5ad2/64 scope global dynamic
       valid_lft 2591979sec preferred_lft 604779sec
    inet6 2600:3c01:e000:e2::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::a083:1eff:fea5:5ad2/64 scope link
       valid_lft forever preferred_lft forever

# ip -6 route
2600:3c01::/64 dev eth0  proto kernel  metric 256  expires 2591976sec
2600:3c01:e000:e2::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
default via fe80::1 dev eth0  proto ra  metric 1024  expires 67sec

This doesn't work: I can't connect out from the container (using ping6 ipv6.google.com as my test), nor can I reach the container across the internet from my home network (using ping6 ipv6.daaku.org as my test).
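
(In case it helps with reproducing this: watching ICMPv6 on the host is one way to see where things stop, i.e. whether neighbour solicitations for the container's address ever get answered. A sketch, assuming eth0 is the public interface:)

tcpdump -ni eth0 icmp6 and host 2600:3c01:e000:e2::1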

Update: I managed to get outgoing IPv6 working by doing this:

ip -6 addr add 2600:3c01:e000:00e2::1111:1/112 dev docker0 &&
ip6tables -P FORWARD ACCEPT &&
sysctl -w net.ipv6.conf.all.forwarding=1 &&
sysctl -w net.ipv6.conf.all.proxy_ndp=1

CONTAINER_PID=$(docker inspect -f '{{.State.Pid}}' 4fd3b05a04bb)
mkdir -p /run/netns &&
ln -s /proc/$CONTAINER_PID/ns/net /run/netns/$CONTAINER_PID &&
ip netns exec $CONTAINER_PID ip -6 addr add 2600:3c01:e000:00e2::1111:20/112 dev eth0 &&
ip netns exec $CONTAINER_PID ip -6 route add default via 2600:3c01:e000:00e2::1111:1 dev eth0

IPv6 routes on the host:

# ip -6 r
2600:3c01::/64 dev eth0  proto kernel  metric 256  expires 2582567sec
2600:3c01:e000:e2::1111:0/112 dev docker0  proto kernel  metric 256
2600:3c01:e000:e2::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev docker0  proto kernel  metric 256
fe80::/64 dev veth1775864  proto kernel  metric 256
fe80::/64 dev veth102096c  proto kernel  metric 256
fe80::/64 dev vethdf3a55b  proto kernel  metric 256

IPv6 routes in the container:

# ip -6 r
2600:3c01:e000:e2::1111:0/112 dev eth0  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
default via 2600:3c01:e000:e2::1111:1 dev eth0  metric 1024

Still can't ping it from my home machine.

daaku

3 Answers


I think your problem is routing-related. The trouble is that you've been assigned a flat /64, but you've decided to sub-subnet off a /112. That's fine for outbound traffic, because your container host knows about all the individual sub-subnets, but when your ISP comes to handle the return packets, they don't know that you've subsectioned off 2600:3c01:e000:e2::1111:0/112 and that that should be routed via 2600:3c01:e000:00e2::1. They just expect the whole of 2600:3c01:e000:00e2::/64 to be sitting there, directly-connected and accessible via unicast.

The problem is that there's no mechanism to tell your ISP that you've decided to start sub-subnetting (actually, that's a lie, there are a number of ways - but they all require the cooperation of your ISP). Your simplest bet is probably to stop routing the traffic to the containers, and start bridging it.

I can't tell you exactly how to do that. I tried, and several people kindly pointed out that I was wrong. Hopefully someone can clarify. But the problem remains that you need to bridge your containers to your next-hop-route, and vice-versa, rather than routing them.
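
For what it's worth, the rough shape might be something like the following (completely untested, assuming a single public interface, and assuming the provider will pass frames for the extra addresses at all; do it from the console, because eth0 stops carrying the host's traffic the moment it's enslaved):

ip link set eth0 master docker0
ip -6 addr add 2600:3c01::f03c:91ff:feae:d7d7/64 dev docker0
ip -6 route add default via fe80::1 dev docker0

Containers attached to docker0 would then take addresses straight out of 2600:3c01:e000:e2::/64 and answer neighbour discovery for themselves, which is what the previous hop expects.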

MadHatter
  • docker0 is a bridge, not an interface – Bryan Nov 22 '14 at 15:06
  • docker0 is already a bridge. Docker has a help page about [replacing it with your own bridge](https://docs.docker.com/articles/networking/#building-your-own-bridge). – Michael Hampton Nov 22 '14 at 15:15
  • Thanks, I'd suspected as much. The issue was what happens to traffic on its way from `docker0` to the world, and vice-versa, and I'm suggesting that it's **that** hop that should be bridged (rather than routed). It may be the right way to do that is to add `eth0` to the `docker0` bridge - as I said, I'm not expert on containerisation - but I'm fairly clear in my own mind about **what** the OP needs to do, even if I can't quite define **how** (s)he should do it. – MadHatter Nov 22 '14 at 15:35
  • I probably would have just routed the /64 to docker0. – Michael Hampton Nov 22 '14 at 15:49
  • @MichaelHampton I'm happy to route the entire /64 to docker0 too. MadHatter the /64 routes to 2600:3c01::f03c:91ff:feae:d7d7/64 -- can that be the other address on the eth0-internet link? – daaku Nov 22 '14 at 20:00
  • That is the address on the *far* end of the eth0-internet link, yes. Michael may have a point; I don't know if you can set it up like that, with the external address of eth0 being only link-local, but I suspect it to be possible, and if you're happier with that than with bridging it, it's certainly worth a try. The disadvantage may be that although the guests will now be able to talk v6 to the world, the host may no longer be able to do so, as its outward-facing interface will have a non-routable address on. – MadHatter Nov 23 '14 at 08:11
  • What does "They just expect the whole of 2600:3c01:e000:00e2::/64 to be sitting there" mean? Can the ISP send the packets in two different ways, where one can be routed by the host and one cannot? – jochen May 26 '15 at 21:07
  • Not so much that the packet *can't* be routed as that it *isn't* routed. Previous-hop inbound - which is the ISP's kit - has to know to route packets intended for the "inside" sub-subnet via the right address in the "outside" sub-subnet, rather than assuming there are no more hops required. – MadHatter May 31 '15 at 10:31

In Docker 1.0, there are two options for enabling IPv6 connectivity to docker containers. I had to use the lxc driver rather than libcontainer to get both of these methods to work. You may be able to use radvd; I didn't attempt it.
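
(If radvd were used, it would sit on the docker bridge and advertise the prefix to the containers; a rough, untried /etc/radvd.conf using the prefix from the question might look like:)

interface docker0 {
    AdvSendAdvert on;
    prefix 2600:3c01:e000:e2::/64 {
        AdvOnLink on;
        AdvAutonomous on;
    };
};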

1) Have the provider route the /64 to your docker host. This is the easiest option. Enable IPv6 forwarding and assign the /64 to docker0. You don't have to break this network up into smaller ones (e.g., a /112) unless you have multiple docker bridges or multiple docker hosts.
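
On the host side that comes down to roughly the following (a sketch using the prefix from the question; ::1 on docker0 becomes the containers' default gateway):

sysctl -w net.ipv6.conf.all.forwarding=1
ip -6 addr add 2600:3c01:e000:e2::1/64 dev docker0

Containers then take addresses from 2600:3c01:e000:e2::/64 and use 2600:3c01:e000:e2::1 as their default gateway.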

This method is discussed in depth in Andreas Neuhaus's blog post "IPv6 in Docker Containers". See http://zargony.com/2013/10/13/ipv6-in-docker-containers.

Note that very few IPv6-enabled IaaS providers will route a /64 to a VM. The second method overcomes this limitation in a semi-kludgy way.

2) Use a subset of the /64 from the LAN interface on the docker bridge. This method doesn't require a /64 routed to your docker host. A smaller network within the /64 (e.g., a /112) is assigned to docker0, and NDP proxying is configured so that your LAN interface (probably eth0) answers neighbour solicitations on behalf of the container addresses.
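
Sketched with the addressing from the question (the ::1111:20 address is the container's, from the update), that looks roughly like:

sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
ip -6 addr add 2600:3c01:e000:e2::1111:1/112 dev docker0
ip -6 neigh add proxy 2600:3c01:e000:e2::1111:20 dev eth0

The last line is the part the question's update appears to be missing: as far as I know, turning on proxy_ndp by itself isn't enough, because the kernel only answers neighbour solicitations for addresses that have an explicit proxy entry, and one entry is needed per container address.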

I wrote a detailed description of this method at http://jeffloughridge.wordpress.com/2014/07/22/ipv6-in-docker-containers-on-digitalocean/.

I haven't used docker versions greater than 1.0. It is possible that things have changed in newer releases.

Jeff Loughridge

Per the RFCs, all addresses in a subnet fall within a /64. Assignments within that space use one or more IPv6 addresses.

You are thinking of IPv6 as if it were just IPv4 with bigger addresses. I am here to tell you that if you design your systems that way, expect to increase your costs of setup, maintenance, and security!

JoeKlein