10

I'm building a container to tune kernel settings for a load balancer. I'd prefer to deploy those changes to the host in an image using a single privileged container. For example:

docker run --rm --privileged ubuntu:latest sysctl -w net.core.somaxconn=65535

In testing, the changes take effect, but only for that container. I was under the impression that, with a fully privileged container, changes to /proc would actually change the underlying OS.

$ docker run --rm --privileged ubuntu:latest \
    sysctl -w net.core.somaxconn=65535
net.core.somaxconn = 65535

$ docker run --rm --privileged ubuntu:latest \
    /bin/bash -c "sysctl -a | grep somaxconn"
net.core.somaxconn = 128

Is this how privileged containers are supposed to work?

Am I just doing something silly?

What is the best way to make lasting changes?

Version Info:

Client version: 1.4.1
Client API version: 1.16
Go version (client): go1.3.3
Git commit (client): 5bc2ff8
OS/Arch (client): linux/amd64
Server version: 1.4.1
Server API version: 1.16
Go version (server): go1.3.3
Git commit (server): 5bc2ff8

Example command with mounted /proc:

$ docker run -v /proc:/proc ubuntu:latest \
    /bin/bash -c "sysctl -a | grep local_port"
net.ipv4.ip_local_port_range = 32768    61000

$ docker run -v /proc:/proc --privileged ubuntu:latest \
    /bin/bash -c "sysctl -p /updates/sysctl.conf"
net.ipv4.ip_local_port_range = 2000 65000

$ docker run -v /proc:/proc ubuntu:latest \
    /bin/bash -c "sysctl -a | grep local_port"
net.ipv4.ip_local_port_range = 32768    61000

$ docker run -v /proc:/proc --privileged ubuntu:latest \
    /bin/bash -c "sysctl -a | grep local_port"
net.ipv4.ip_local_port_range = 32768    61000
allingeek

4 Answers

8

This particular setting falls under the influence of the network namespace that docker runs in.

As a general rule, /proc does alter settings that are relevant system-wide. Technically speaking, however, you are altering settings under /proc/sys/net, which, like /proc/net, is resolved on a per-network-namespace basis.

Note that /proc/net is actually a symlink to /proc/self/net as it really does reflect the settings of the namespace that you are doing the work in.
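One way to see this for yourself is to compare the network namespace identifier on the host with the one seen inside a container. This is only a minimal sketch: netns_id and same_netns are hypothetical helper names, not part of Docker or any standard tool, and the Docker commands in the comments assume an image that ships readlink (ubuntu:latest does).

```shell
# Print the identifier of the network namespace the calling shell is in.
# On Linux this is a link target such as "net:[<inode>]".
netns_id() {
    readlink /proc/self/ns/net
}

# Report whether two namespace identifiers name the same namespace.
same_netns() {
    if [ "$1" = "$2" ]; then
        echo same
    else
        echo different
    fi
}

# Usage sketch (needs Docker, so it is left commented out):
#   host=$(netns_id)
#   ctr=$(docker run --rm ubuntu:latest readlink /proc/self/ns/net)
#   same_netns "$host" "$ctr"   # a default container gets its own namespace
#   ctr=$(docker run --rm --net=host ubuntu:latest readlink /proc/self/ns/net)
#   same_netns "$host" "$ctr"   # --net=host shares the host's namespace
```

If the two identifiers differ, a write to net.core.somaxconn inside the container lands in the container's namespace and the host value is left alone.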

Matthew Ife
  • So, if I make these changes in a container with /proc mounted and --net host, I can make changes to the host. But if I understand your answer, subsequent containers will keep the old values (bootstrapped from the host's persisted settings) in their own namespaces. I'd need to run that container with something like CAP_NET_ADMIN to make the same changes at runtime in the load balancer's container. Sound right? – allingeek Feb 03 '15 at 12:34
  • Yes, running with CAP_NET_ADMIN shouldn't pose an issue where you have instantiated a namespace for it. – Matthew Ife Feb 03 '15 at 13:33
  • @Matthew_Ife Not an issue in this case, since the container is expected to be privileged. It seems to me that CAP_NET_ADMIN could allow escaping from Docker confinement (at least the container could reconfigure its interface to impersonate another container). – Ángel Feb 03 '15 at 13:45
  • @Angel That would depend on what outbound link is set up inside of Docker. Generally one should enforce traffic rules in the parent namespace, though. It would not be possible to switch namespaces into somewhere else, since you need CAP_SYS_ADMIN for that. – Matthew Ife Feb 03 '15 at 15:06
  • So using --net=host will work? – Jairo Andres Velasco Romero Dec 09 '19 at 23:20
7

Docker 1.12+ has native support for tweaking sysctl values inside the containers. Here is an excerpt from the documentation:

Configure namespaced kernel parameters (sysctls) at runtime

The --sysctl sets namespaced kernel parameters (sysctls) in the container. For example, to turn on IP forwarding in the container's network namespace, run this command:

docker run --sysctl net.ipv4.ip_forward=1 someimage

Using your example, the correct way to raise net.core.somaxconn would be:

docker run ... --sysctl net.core.somaxconn=65535 ...
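If the settings already live in a sysctl.conf-style file, like the /updates/sysctl.conf from the question, they could be turned into --sysctl arguments instead of being applied with sysctl -p. The following is a rough sketch under a few assumptions: sysctl_pairs is a hypothetical helper name, the file uses plain "key = value" lines with # or ; comment lines, and every key in it is one that Docker accepts as namespaced (others are rejected by docker run).

```shell
# Print one key=value pair per line from a sysctl.conf-style file,
# dropping comment lines, blank lines, and whitespace around the first "=".
sysctl_pairs() {
    sed -e '/^[[:space:]]*[#;]/d' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' "$1"
}

# Usage sketch in bash (needs Docker, so it is left commented out):
#   args=()
#   while IFS= read -r pair; do
#       args+=(--sysctl "$pair")
#   done < <(sysctl_pairs /updates/sysctl.conf)
#   docker run --rm "${args[@]}" ubuntu:latest sysctl net.core.somaxconn
```

Keeping each pair as a single array element preserves values that contain spaces, such as the two numbers of net.ipv4.ip_local_port_range.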
hyperknot
4

The privileged container is still using its own process namespace for /proc. What you can do is to mount the real /proc inside the container:

docker run --rm --privileged -v /proc:/host-proc ubuntu:latest \
  /bin/sh -c 'echo 65535 > /host-proc/sys/net/core/somaxconn'
Ángel
  • Just tried this and it doesn't work. – allingeek Feb 03 '15 at 12:08
  • From the little I know about docker; it is supposed to be a self-contained instance, like a jail on FreeBSD, so it can be easily moved around, redeployed, etc... You should not mix up the docklet with the host OS. – DutchUncle Feb 03 '15 at 12:11
  • There are several valid cases for using --privileged containers and this seems like the perfect case. All containers use the same underlying kernel. Standard containers mount /proc as read only. – allingeek Feb 03 '15 at 12:15
  • @allingeek CAP_NET_ADMIN may indeed be the missing bit. – Ángel Feb 03 '15 at 13:45
  • Tried with NET_ADMIN and it still doesn't work - docker run --cap-add NET_ADMIN --net=host -v /proc:/proc_host ubuntu:14.04 bash -c 'echo 1 >/proc_host/sys/net/ipv4/ip_forward' && sysctl net.ipv4.ip_forward net.ipv4.ip_forward = 0 – tomdee Apr 01 '15 at 20:05
2

This works for me with Docker 1.5.0:

docker run --privileged --net=host --rm ubuntu:latest /bin/sh -c \
   'echo 65535 > /proc/sys/net/core/somaxconn'
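The same command can be generalised to other keys by deriving the /proc/sys path from the sysctl name. This is a sketch only: write_snippet is a hypothetical helper, it assumes key names whose dots are all path separators (so no interface names containing dots), and the docker invocation in the comments changes the host's setting, so treat it with care.

```shell
# Build the shell snippet that writes VALUE ($2) to the /proc/sys file
# for KEY ($1), e.g. net.core.somaxconn -> /proc/sys/net/core/somaxconn.
write_snippet() {
    printf 'echo %s > /proc/sys/%s\n' "$2" "$(printf '%s' "$1" | tr '.' '/')"
}

# Usage sketch (needs Docker; modifies the HOST, so it is left commented out):
#   docker run --privileged --net=host --rm ubuntu:latest \
#       /bin/sh -c "$(write_snippet net.core.somaxconn 65535)"
#   cat /proc/sys/net/core/somaxconn   # run on the host to check the result
```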