
My application benefits greatly from advanced CPU features that gcc can use when it is run with `-march=native`. Docker can smooth over differences in OS, but how does it handle different CPUs? To build an application that can run on any CPU I would have to build for generic amd64, losing out on a lot of performance. Is there a good way to distribute Docker images when the application needs to be compiled separately for each CPU architecture?

    One general solution to this problem is to build multiple versions and pick the best one at runtime. (e.g. with a helper script). Or if your code is a dynamic library, with dynamic library tricks that resolve the function pointer to a version for the CPU that's running it. But if your program isn't a shared library, and you don't want runtime dispatch overhead *inside* your program, then yeah building multiple versions and choosing one to run with a script or CPU-detection wrapper program is one way to go. (e.g. a C or asm program that runs `cpuid`, or a shell script parsing /proc/cpuinfo) – Peter Cordes Jun 19 '19 at 01:02
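The CPU-detection wrapper script mentioned in the comment could be sketched like this (the `myapp-*` binary names are hypothetical; a real wrapper would `exec` the chosen binary instead of just printing it):

```shell
# Sketch of a CPU-detection wrapper: parse /proc/cpuinfo and pick a
# per-CPU build of the program. The myapp-* names are assumptions
# for illustration.
if grep -qw avx2 /proc/cpuinfo 2>/dev/null; then
  exe=./myapp-avx2
elif grep -qw avx /proc/cpuinfo 2>/dev/null; then
  exe=./myapp-avx
else
  exe=./myapp-baseline
fi
echo "selected: $exe"
# a real wrapper would end with:  exec "$exe" "$@"
```

`grep -w` matches whole flag words, so `avx` does not accidentally match `avx2`; on a system without `/proc/cpuinfo` the script falls back to the baseline build.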

1 Answer


Docker doesn't handle the CPU at all. It is just a composition of kernel namespacing, filesystem layering (e.g. UnionFS) and resource quotas (cgroups).
When you run something in a Docker container it is just an executable running on your OS, without virtualisation; it has access only to a selected set of kernel objects (e.g. devices) and it is chrooted to a FS hierarchy resulting from overlaying various filesystems (including the one in the Docker image).

Hence Docker doesn't handle the CPU at all; it is completely orthogonal to your problem.

As Peter commented, there are essentially two ways to CPU-dispatch:

  1. You load the right dynamic library (but every function call into the library uses a pointer).
  2. You build multiple versions of the same statically-linked binary and run the right one.
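A minimal sketch of option 1, assuming the image ships one build of the shared library per CPU baseline (all paths under `/opt/myapp` are hypothetical):

```shell
# Pick the library directory matching the running CPU and point the
# dynamic loader at it. The /opt/myapp paths are assumptions for
# illustration.
if grep -qw avx2 /proc/cpuinfo 2>/dev/null; then
  libdir=/opt/myapp/lib/avx2
else
  libdir=/opt/myapp/lib/baseline
fi
echo "using libraries from $libdir"
# a real launcher would continue with:
#   LD_LIBRARY_PATH="$libdir" exec /opt/myapp/bin/myapp "$@"
```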

The main issue is that ISA extensions are sometimes orthogonal, which makes the number of combinations (i.e. the number of libraries/binaries) grow exponentially. So, considering that you are dealing with Docker's userbase, you can simplify the approach a bit (if combinations are a problem):

  1. Either make some ISA extensions required (if their absence would degrade performance too much). For the optional extensions you can use one of the approaches above.
  2. Create only a few baseline containers, e.g. one for generic amd64, one for amd64-avx, one for amd64-avx2-aesni-tsx, and so on. The idea is to create only a few containers that respectively cover all, most, and a few of your users.
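The baseline-container idea could be driven by a small build script; here is a dry-run sketch (the tag names and the `CFLAGS` build-arg are assumptions about your Dockerfile; the commands are collected and printed rather than executed):

```shell
# Dry-run sketch: emit one `docker build` per CPU baseline, each tag
# mapping to a gcc flag set. Tag names and the CFLAGS build-arg are
# assumptions for illustration; run the emitted commands to build.
builds=""
for spec in \
    "generic:-march=x86-64" \
    "avx:-march=x86-64 -mavx" \
    "avx2:-march=x86-64 -mavx2 -maes"; do
  tag=${spec%%:*}      # text before the first ':'
  flags=${spec#*:}     # text after the first ':'
  builds="${builds}docker build --build-arg CFLAGS='$flags' -t myapp:amd64-$tag .
"
done
printf '%s' "$builds"
```

Users (or the wrapper script above the answer) then pull the most specific tag their CPU supports.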

EDIT
As BeeOnRope pointed out in the comments, Docker has a version running on Windows. It uses Hyper-V to run a Linux VM with the Linux version of Docker.
As Hyper-V is a native VMM, apart from an extra layer, the same considerations apply.
Similarly, there is a macOS version too; this time it uses a hypervisor framework based on xhyve.

Margaret Bloom
    Good answer! It might be worth noting that in any scenario other than Linux-on-Linux (using the notation _guest_-on-_host_), a VM *is* used. So Docker is effectively a completely different technology outside the L-on-L scenario (but I doubt the VM is going to emulate host-unsupported instructions). – BeeOnRope Jul 02 '19 at 23:35
    @BeeOnRope good point! I was completely unaware of Docker for Windows. Thank you, I'll edit this into the answer. – Margaret Bloom Jul 03 '19 at 06:43
  • There's docker for OSX too. In terms of pure scenarios, L-on-L is in the minority, although in terms of actual use, I'm not sure. – BeeOnRope Jul 03 '19 at 06:50
    @BeeOnRope [Oh right! this time it bundles a custom VMM](https://news.ycombinator.com/item?id=11352594). – Margaret Bloom Jul 03 '19 at 06:58
    I wasn't suggesting statically linked binaries. But sure, if you have a case where executable *and* libraries both benefit from being compiled with `-march=whatever`, then static linking is probably a good way to do it. And with Docker, you're maybe not losing out on the opportunity for library read-only pages to be shared with other tasks anyway, since the libc inside your docker image is a separate file than the libc in another docker image, or on the host. (I think?) And certainly for a custom library. – Peter Cordes Jul 03 '19 at 07:04
  • @PeterCordes I was thinking of a scenario where the OP pushes all the march-related code into multiple versions of the same library. Then they can easily implement either choice. But it also goes for standard libraries. – Margaret Bloom Jul 03 '19 at 07:11
  • I might build custom libs as static (.a) libraries so *they* get statically linked into the different executables, but you still dynamically link with the standard libraries. I forget what happens with glibc's dynamic dispatch for memcpy (SSE2 vs. AVX) when you statically link libc. But it will remove PLT or GOT-indirect overhead for library calls. – Peter Cordes Jul 03 '19 at 07:41