24

I am trying to change net.core.somaxconn for a Docker container so that my web application can have a larger queue of requests.

On the host OS, outside Docker, I first modify the property successfully:

$ cat /proc/sys/net/core/somaxconn
128
$ sudo sysctl -w net.core.somaxconn=1024
net.core.somaxconn = 1024
$ cat /proc/sys/net/core/somaxconn
1024

But then I don't know how to propagate that change into Docker. I've tried:

  • Also editing /etc/sysctl.conf (in the hope that Docker reads that file on container launch)
  • Restarting the containers with sudo docker stop and sudo docker run
  • Restarting the whole Docker service with sudo service docker restart

But inside the container, cat /proc/sys/net/core/somaxconn always shows 128.

I'm running Docker 1.2 (so I cannot, by default, modify /proc attributes inside the container) on Elastic Beanstalk (so without --privileged mode, which would allow me to modify /proc).

How can I propagate the sysctl changes to docker?

Tuukka Mustonen

7 Answers

41

The "net/core" subsystem is registered per network namespace, and the initial value of somaxconn is 128.

When you run sysctl on the host system, it sets the core parameters for the host's network namespace, which is the one owned by init (basically, the default namespace). This does not affect other network namespaces.

When a Docker container is started, the virtual network interface of that container (which shows up as vethXXX on the host) is attached to its own namespace, which still has the initial somaxconn value of 128. So technically you cannot propagate this value into the container, since the two network namespaces do not share it.
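To check whether two processes really sit in different network namespaces, you can compare their namespace links under /proc. This is a minimal sketch, assuming a Linux host with /proc mounted. The function name same_netns and its optional proc-root argument are made up for illustration, and reading another process's ns/net link generally requires root.

```shell
#!/bin/sh
# same_netns PID_A PID_B [PROC_ROOT]   (hypothetical helper)
# Exit 0 if both processes share one network namespace, 1 if they
# do not, 2 if a namespace link cannot be read.
# PROC_ROOT defaults to /proc; it is a parameter only to ease testing.
same_netns() {
    proc_root="${3:-/proc}"
    ns_a=$(readlink "$proc_root/$1/ns/net") || return 2
    ns_b=$(readlink "$proc_root/$2/ns/net") || return 2
    [ "$ns_a" = "$ns_b" ]
}

# Example (as root), comparing init with a container's main process:
#   pid=$(docker inspect -f '{{.State.Pid}}' CONTAINER_NAME)
#   same_netns 1 "$pid" && echo "shared" || echo "separate"
```

For a container started with the default bridge network this should report "separate", which is why the host's sysctl setting is not visible inside it.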

There are, however, two ways you can adjust this value, in addition to running the container in privileged mode.

  1. Use "--net host" when running the container, so that it uses the host's network interface and hence shares the same network namespace.

  2. Mount the proc file system as read-write using Docker's volume mapping support. The trick is to map it to a volume NOT named "/proc", since Docker remounts /proc/sys, among others, as read-only for non-privileged containers. This requires the host to mount /proc as rw, which is the case on most systems.

    docker run -it --rm -v /proc:/writable-proc ubuntu:14.04 /bin/bash
    root@edbee3de0761:/# echo 1024 > /writable-proc/sys/net/core/somaxconn
    root@edbee3de0761:/# sysctl net.core.somaxconn
    net.core.somaxconn = 1024
    

Method 2 should work on Elastic Beanstalk via its volume mapping support in Dockerrun.aws.json. It should also work for other tunable parameters under /proc that are per-namespace. However, this is most likely an oversight on Docker's part, so they may add additional validation on volume mapping, at which point this trick will stop working.
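If you go with method 2, the write can be done at the top of the container's start script. This is a rough sketch only: the function name set_somaxconn is made up, /writable-proc is simply the mount point from the example above, and my-web-app stands in for whatever your container actually runs.

```shell
#!/bin/sh
# set_somaxconn VALUE [PROC_ROOT]   (hypothetical helper)
# Writes VALUE to PROC_ROOT/sys/net/core/somaxconn.
# PROC_ROOT defaults to /writable-proc, the mount point used above.
set_somaxconn() {
    value="$1"
    target="${2:-/writable-proc}/sys/net/core/somaxconn"
    if [ ! -w "$target" ]; then
        # The mount is missing or read-only; report and carry on.
        echo "cannot write $target; leaving somaxconn unchanged" >&2
        return 1
    fi
    echo "$value" > "$target"
}

# Typical use in a start script (placeholder application name):
#   set_somaxconn 1024 || true
#   exec my-web-app
```

Failing softly keeps the container starting even if a later Docker version closes this loophole.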

zliuva
  • 1
    That's pretty clever answer, you really know your way around docker :) Sounds like there's no sense to fight it - if `/proc` was intentionally made writable only in privileged mode, I guess the future-proof solution is to ask AWS engineers to enable/allow it in EB. As the underlying EC2 machine is already "owned" by us, there shouldn't be any reason to disallow privileged mode... until then, I'll try your workaround tomorrow and report in! – Tuukka Mustonen Oct 05 '14 at 15:08
  • 2
    Well, as you suggested, the 2nd workaround is working perfectly on EB, so we'll be sticking to that for now. I'm not sure if I fully understand how modifying `/proc` (through `/writable-proc`) from inside container actually modifies container's namespace and not parent OS interfaces' namespace, from which it is mounted, but you've saved me a dozen of hours, so big thanks. I've also opened a question on Beanstalk forum about using privileged mode at https://forums.aws.amazon.com/thread.jspa?threadID=162290 – Tuukka Mustonen Oct 06 '14 at 08:28
  • 1
    This trick is reportedly not working anymore: http://serverfault.com/a/664589/60525 – chrishiestand Apr 27 '15 at 10:13
  • Why does this work at all? Afterwards, there are two proc mounts, the readonly at /proc and the writable at /writable-proc (or whatever). Why does *merely mounting the default namespaces' /proc* overwrite the usage of the values in the containers /proc/ ? – mknecht Oct 14 '15 at 08:33
8

Docker 1.12 added support for setting sysctls with --sysctl.

docker run --name some-redis --sysctl=net.core.somaxconn=511 -d redis

docs: https://docs.docker.com/engine/reference/commandline/run/#/configure-namespaced-kernel-parameters-sysctls-at-runtime
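To confirm that the setting took effect, you can read the value back from inside the container. The helper below is a small sketch; the function name somaxconn_at_least and its optional proc-root argument are assumptions for illustration, not part of Docker.

```shell
#!/bin/sh
# somaxconn_at_least MIN [PROC_ROOT]   (hypothetical helper)
# Exit 0 if the current somaxconn is >= MIN, 1 if it is lower,
# 2 if the value cannot be read.
# PROC_ROOT defaults to /proc; it is a parameter only to ease testing.
somaxconn_at_least() {
    min="$1"
    current=$(cat "${2:-/proc}/sys/net/core/somaxconn") || return 2
    if [ "$current" -lt "$min" ]; then
        echo "somaxconn is $current, expected at least $min" >&2
        return 1
    fi
}

# Quick manual check against the container started above:
#   docker exec some-redis cat /proc/sys/net/core/somaxconn
```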

eshizhan
4

I found a solution:

{
    "AWSEBDockerrunVersion": "1",
    "Command": "run COMMAND",
    "Image": {
        "Name": "crystalnix/omaha-server",
        "Update": "true"
    },
    "Ports": [
        {
            "ContainerPort": "80"
        }
    ]
}

More details can be found in /opt/elasticbeanstalk/hooks/appdeploy/pre/04run.sh

Update:

Add the file .ebextensions/02-commands.config:

container_commands:
    00001-docker-privileged:
        command: 'sed -i "s/docker run -d/docker run --privileged -d/" /opt/elasticbeanstalk/hooks/appdeploy/pre/04run.sh'
Egor Yurtaev
4

Update: This answer is obsolete as Docker now supports the docker run --sysctl option!

The solution I use for my OpenVPN container is to enter the container's namespaces with full capabilities using nsenter, remount /proc/sys read-write temporarily, set things up, and then remount it read-only again.

Here is an example that enables IPv6 forwarding in the container:

CONTAINER_NAME=openvpn

# enable ipv6 forwarding via nsenter
container_pid=`docker inspect -f '{{.State.Pid}}' $CONTAINER_NAME`
nsenter --target $container_pid --mount --uts --ipc --net --pid \
   /bin/sh -c '/usr/bin/mount /proc/sys -o remount,rw;
               /usr/sbin/sysctl -q net.ipv6.conf.all.forwarding=1;
               /usr/bin/mount /proc/sys -o remount,ro;
               /usr/bin/mount /proc -o remount,rw # restore rw on /proc'

This way the container does not need to run privileged.

neingeist
  • 1
    This is absolutely brilliant. Thanks so much for sharing! – Marius Jun 24 '16 at 22:16
  • This solution is the only one on this question which works in the latest Docker in an environment (Amazon ECS) which does not expose the `--sysctl` option from `docker run`. – ZiggyTheHamster Jun 01 '18 at 19:57
2

I just figured out how to solve this: Elastic Beanstalk now supports running privileged containers. You just need to add "privileged": "true" to your Dockerrun.aws.json, as in the following sample (take a look at container-1):

{
  "AWSEBDockerrunVersion": 2,
  "containerDefinitions": [{
    "name": "container-0",
    "essential": "false",
    "image": "ubuntu",
    "memory": "512"
  }, {
    "name": "container-1",
    "essential": "false",
    "image": "ubuntu",
    "memory": "512",
    "privileged": "true"
  }]
}

Please note that I duplicated this answer from another thread.

Community
herrera
2

Version 3.1 of the Docker Compose file format supports specifying sysctls. Note this key:
sysctls:
    - net.core.somaxconn=1024

My example docker-compose file:

version: '3.1'
services:
  my_redis_master:
    image: redis
    restart: always
    command: redis-server /etc/redis/redis.conf
    volumes:
      - /data/my_dir/redis:/data
      - /data/my_dir/logs/redis:/var/tmp/
      - ./redis/redis-master.conf:/etc/redis/redis.conf
    sysctls:
      - net.core.somaxconn=1024
    ports:
      - "18379:6379"
nizam.sp
1

As in @nizam.sp's answer, Docker Compose supports sysctls. I had the same issue as @Or Gal ("Ignoring unsupported options:"), but a different syntax was accepted. Example stanza from docker-compose.yaml:

redis:
  image: redis
  container_name: redis
  sysctls: 
    net.core.somaxconn: "1024"

source: https://rollout.io/blog/adjusting-linux-kernel-parameters-with-docker-compose/

I realize this should be a comment on the appropriate answer, but as a newbie without enough rep to comment, I have to jump in and 'answer'.

Ian Hayhurst