I'm trying to figure out how to use nvidia-docker (https://github.com/NVIDIA/nvidia-docker) with Ansible's docker_container module (https://docs.ansible.com/ansible/latest/docker_container_module.html#docker-container).
Problem
My current Ansible playbook runs my container using the "docker" command instead of "nvidia-docker".
What I have done
Based on what I have read, I tried passing the NVIDIA devices to the container, without success:
docker_container:
  name: testgpu
  image: "{{ image }}"
  devices: ['/dev/nvidiactl', '/dev/nvidia-uvm', '/dev/nvidia0', '/dev/nvidia-uvm-tools']
  state: started
Note: I tried different syntaxes for devices (inline, etc.), but I still get the same problem.
This task does not throw any error. As expected, it creates a Docker container from my image and tries to start it.
Looking at my container logs:
terminate called after throwing an instance of 'std::runtime_error'
what(): No CUDA driver found
which is the exact same error I'm getting when running
docker run -it <image>
instead of
nvidia-docker run -it <image>
Any ideas on how to override the docker command when using docker_container with Ansible?
I can confirm that my CUDA drivers are installed and that all the /dev/nvidia* paths are valid.
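For anyone who wants to reproduce the device check on the host, something along these lines should work. It is only a sketch: the `check_device` helper is a hypothetical name introduced here, and the device list is copied from the docker_container task above, so it assumes a single GPU exposed as /dev/nvidia0.

```shell
#!/bin/sh
# Report whether a given device node (or any path) exists on the host.
# check_device is a hypothetical helper used only for this illustration.
check_device() {
    if [ -e "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

# Same device list as in the docker_container task (single-GPU assumption).
for dev in /dev/nvidiactl /dev/nvidia-uvm /dev/nvidia0 /dev/nvidia-uvm-tools; do
    check_device "$dev"
done
```

This only confirms that the device nodes exist on the host. It says nothing about whether the driver libraries are visible inside the container.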
Thanks