
I'm trying to run OpenGL applications (Gazebo) inside a Ubuntu 16.04 container, and I'd like to be able to take advantage of nvidia graphics acceleration when available. I'm trying to figure out what the recommended, officially supported (by nvidia, hopefully) way to achieve this is.

My requirements:

  1. Creating the image is time-consuming, so I'd like to either have one image for all kinds of graphics (nvidia, mesa i.e. everything else), or if separate, they should be built "FROM" a common base image with the bulk of the content.
  2. The nvidia container should work on different systems which may have different nvidia cards and driver versions installed.
  3. I need to use Ubuntu 16.04, company requires this, although this is the least-important of these requirements, e.g. if this could only be done on 18.04 I'd be interested as well.

What I've tried so far:

  • Just build separate images for nvidia and everything else, with FROM nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04. This works well, but requires building two images, which takes twice the time and twice the disk space. Breaks requirement 1.
  • Build the "normal" (mesa/intel) image first from ubuntu:16.04, do all the time-consuming stuff there, then use this as the base for another image where the NVIDIA driver is installed manually from the official "run file". This works if the driver version exactly matches the one installed on the host, but not if the host has a different (e.g. older) version. Breaks requirement 2.
  • Do nothing, just run my regular mesa-enabled container with --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all (a sample invocation is shown after this list). If I do, nvidia-smi sees the card, but OpenGL (e.g. glxinfo) still tries to load the swrast driver, which doesn't work.
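For reference, that third attempt boils down to roughly the following (my-mesa-image is just a placeholder for my regular image, and the DISPLAY/X11 socket flags are my usual X forwarding setup):

docker run --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e NVIDIA_DRIVER_CAPABILITIES=all \
    -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \
    my-mesa-image glxinfo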

Most examples I've seen in the wild use the nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 base, and for the life of me I cannot find how the nvidia drivers are installed (if at all) in that image. I've also read somewhere that with the nvidia container runtime (i.e. nvidia-docker2, which I'm using) you don't need to install the drivers, but that doesn't seem to be the case, at least not for OpenGL.

So again, is there a way to create container images for nvidia and non-nvidia that satisfy all my requirements, or do I just want too much?

Daniele
  • Can't you configure Docker to bind mount single files from the host environment? I've never used Docker myself, but I often use lxc or voidlinux containers, and essentially you can "bind" single files from the outside into the container. Getting the `/dev` graphics nodes in there is trivial, so it's all just about the OpenGL libraries (and, if you want Vulkan, the ICDs). So my approach would be to have some helper program that, on the outside, determines which OpenGL/Vulkan-related library files there are, and bind mount those into the container at startup. – datenwolf Dec 15 '18 at 09:56
  • @datenwolf Yes, I can configure docker to bind mount single files, and what's more, nvidia-docker takes care of that already. The problem is not having the container "see" the card, the problem is having the right drivers inside to run OpenGL applications. – Daniele Dec 17 '18 at 19:46
  • Passing through the files I mentioned **_IS_** passing through the drivers. Unfortunately the paths are kind of distribution specific, so the whole idea of distribution independence with docker images kind of leaves the stage here. But if you take a look at the output of `ldd $(which glxinfo) | grep libGL` and bind the files reported there (make sure to resolve symlinks!) into the container, you actually pull in the OpenGL drivers from outside, at least for clients. For Vulkan you may want to look at the contents of `/usr/share/vulkan/icd.d` – datenwolf Dec 17 '18 at 20:02
  • The point here is that graphics drivers in Linux consist of a kernel part and a userland part. The kernel part is shared among all of userspace, regardless of being inside a container or not. The userspace part has to match the kernel part. And the kernel driver is managed as part of the host system, so the only sensible approach is to bind the host system's userland parts into the container. Which is why you want to bind mount those particular files. – datenwolf Dec 17 '18 at 20:03
  • @datenwolf Yes exactly, I was aware of that, but you basically confirmed it. The nvidia container runtime (a.k.a. nvidia-docker 2.x) is supposed to do exactly that, and it does, except apparently for OpenGL; I suspect it's mostly designed for CUDA, but it does have an "opengl" capability that can be enabled. – Daniele Dec 17 '18 at 20:06
  • So, yes, maybe I need to mount the host libGL, like you say (a sketch of that idea follows these comments). I thought about it, but then, look at `nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04` (official from nvidia): it somehow manages to work on every system, in conjunction with the nvidia container runtime, and I was hoping not to have to reverse engineer it. – Daniele Dec 17 '18 at 20:08
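For what it's worth, a rough sketch of datenwolf's bind-mount idea could look like the following. The image name is just a placeholder; this only covers the client-side GL libraries that ldd reports plus /dev/dri, whereas an NVIDIA setup would also need the /dev/nvidia* device nodes and the driver's companion libraries (which is exactly what the nvidia runtime is supposed to handle):

mounts=""
for lib in $(ldd "$(which glxinfo)" | grep libGL | awk '{print $3}'); do
    # mount the real file (symlinks resolved) at the path the loader expects
    mounts="$mounts -v $(readlink -f "$lib"):$lib:ro"
done
docker run $mounts --device /dev/dri \
    -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \
    my-mesa-image glxinfo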

1 Answer


Why waste time trying to figure out a solution yourself when you can just "steal" someone else's? Especially if that someone else is NVIDIA themselves.

Since nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 seems to work well, but using it as a base breaks requirement 1, I can just copy files out of it and into my image instead.

Here ${from} points to my original, non-nvidia-aware container image (but I also tested it with from=ubuntu:16.04), and I just copy the glvnd libraries and configuration from NVIDIA's image over:

ARG from
FROM nvidia/opengl:1.0-glvnd-runtime-ubuntu16.04 as nvidia
FROM ${from}

COPY --from=nvidia /usr/local /usr/local
COPY --from=nvidia /etc/ld.so.conf.d/glvnd.conf /etc/ld.so.conf.d/glvnd.conf

ENV NVIDIA_VISIBLE_DEVICES=all NVIDIA_DRIVER_CAPABILITIES=all

With this, and my ${from} built on top of ubuntu:16.04, glxinfo reports the expected configuration (NVIDIA as the GL vendor), and I can run Gazebo, Blender, etc. just like on the host. The beauty of this is that the resulting container works even without the nvidia runtime, on a system without nvidia drivers: it just gracefully falls back to Mesa (I guess that's what "glvnd" does).
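
For completeness, this is roughly how I build and test it (the image names are just placeholders, and the DISPLAY/X11 socket flags are my usual X forwarding setup; the ENV line in the Dockerfile already takes care of the NVIDIA_* variables):

docker build --build-arg from=my-base-image -t my-gl-image .

# On a machine with the nvidia runtime and driver:
docker run --runtime=nvidia \
    -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \
    my-gl-image glxinfo | grep "OpenGL vendor"

# On a machine without nvidia, the same image falls back to Mesa:
docker run -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \
    my-gl-image glxinfo | grep "OpenGL vendor"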

While I am currently required to use Ubuntu 16.04, I see no reason why a similar approach wouldn't work for other Ubuntu versions.

Daniele