I am managing several compute hosts running Ubuntu 18.04 (with systemd) and Docker inside a mostly trusted network.
I have an auth server, so rather than manually adding users to the local docker group on each host so that they can run docker commands, I created a group ldap-docker on the auth server and add my users to that. I then added "group": "ldap-docker" to /etc/docker/daemon.json and removed the local "docker" group from the systems.
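For reference, the relevant part of /etc/docker/daemon.json looks roughly like this. It is a sketch showing only the group key; any other keys a host might have are omitted:

```json
{
  "group": "ldap-docker"
}
```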
This works fine on several hosts, but on some of them docker.service won't start because /var/run/docker.sock is still owned by root:root rather than root:ldap-docker. The docker.socket unit also reports a failure to start:
$ docker ps
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get http://%2Fvar%2Frun%2Fdocker.sock/v1.40/containers/json: dial unix /var/run/docker.sock: connect: permission denied
$ sudo systemctl status docker.socket
● docker.socket - Docker Socket for the API
   Loaded: loaded (/lib/systemd/system/docker.socket; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2021-03-12 08:11:48 PST; 8h ago
   Listen: /var/run/docker.sock (Stream)
Mar 12 08:11:48 host.example.com systemd[1]: Starting Docker Socket for the API.
Mar 12 08:11:48 host.example.com systemd[1171]: docker.socket: Failed to resolve group docker: Connection refused
Mar 12 08:11:48 host.example.com systemd[1]: docker.socket: Control process exited, code=exited status=216
Mar 12 08:11:48 host.example.com systemd[1]: docker.socket: Failed with result 'exit-code'.
Mar 12 08:11:48 host.example.com systemd[1]: Failed to listen on Docker Socket for the API.
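The group named in that error is "docker", not "ldap-docker", which suggests the socket unit resolves its own group independently of daemon.json. As far as I know, the stock docker.socket unit sets SocketGroup=docker, so a drop-in along these lines might be what is needed. This is an untested sketch; the drop-in path is the conventional systemd location and an assumption on my part, not something taken from my hosts:

```ini
# /etc/systemd/system/docker.socket.d/override.conf  (assumed path)
[Socket]
# Replace the packaged SocketGroup=docker with the LDAP-backed group
SocketGroup=ldap-docker
```

After a sudo systemctl daemon-reload, the socket should be created as root:ldap-docker, provided the group can be resolved at that point in boot, which the "Connection refused" message makes me doubt.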
I can run sudo chgrp on /var/run/docker.sock, but by then the docker service has already failed to start, so that doesn't help.
How do you control the startup of the docker.socket unit? And why would my setup work on some machines but not others?
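To narrow down what differs between a working host and a failing one, a check like the following could be run on each. It is a minimal sketch: check_group is a hypothetical helper name, and it only reports whether a group name resolves through NSS (local files or the auth server) at the moment it is run, not during early boot:

```shell
# Report whether a group name resolves through NSS (local files, LDAP, ...).
# check_group is a made-up helper for illustration.
check_group() {
    if getent group "$1" >/dev/null 2>&1; then
        echo "$1: resolves"
    else
        echo "$1: does not resolve"
    fi
}

# The two groups involved: the packaged default and the LDAP one.
check_group docker
check_group ldap-docker
```

Running systemctl cat docker.socket alongside this shows the unit file together with any drop-ins, which makes it possible to compare the SocketGroup= setting in effect on each host.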