Simple way to merge multiple source files into one fatbinary

Question

To simplify the build process in a project, I'd like to compile multiple source files into device PTX code, and have all those modules in a single .fatbin file to be linked later.

I can currently achieve this either by compiling each file individually to .ptx, or by compiling them all at once with --keep to retain the intermediate files, and then adding each one to a fatbinary explicitly:

nvcc -c --keep mysource1.cu mysource2.cu ...
fatbinary --create="mysources.fatbin" --image3=kind=ptx,file=mysource1.ptx --image3=kind=ptx,file=mysource2.ptx ...

This is quite cumbersome, so I was wondering whether there is a simpler, terser way of doing it, perhaps in a single nvcc invocation. I've tried calling nvcc --fatbin --device-link on multiple source files, but that does not seem to keep the PTX code in the output fatbinary (at least not when inspecting it with cuobjdump).

I'm not really sure that your method using `nvcc -c ...` and `fatbinary ...` is actually doing device linking. If your code in `mysource1.cu ...` etc actually required it, your first compile command would fail with an error (ptxas unresolved extern). So if we leave device linking aside, it seems to me anyway that you can do this with a library: `nvcc -arch=sm_XX --lib -rdc=true -o lib.a mysource1.cu ...` If your code requires device linking you would specify that later, when you link your library. — Robert Crovella, Nov 25 '21 at 00:04
Indeed I don't think device linking is necessary in my case, I was just trying out scenarios that seemed to accept multiple input source files. On creating a static library: that actually works! I initially disregarded it because I thought they only supported a single architecture, and only fatbinaries could contain PTX files for different architectures. I just now realized you can still feed nvcc multiple `--gencode=arch,code` arguments with `--lib` set. Thanks @RobertCrovella! — brenocfg, Nov 25 '21 at 00:41

score 1 · Accepted Answer · answered Nov 26 '21 at 15:23

One possible approach here would be to use a library. The command could look something like this:

nvcc -gencode arch=compute_XX,code=sm_XX -gencode ... --lib -rdc=true -o libmy.a mysource1.cu ...

The above command could be used in the case where you know device linking will eventually be necessary. In that case, you would specify the device-link step later, when you link objects or your final executable against the static library.

For the case where you know that device linking will not be necessary, just omit the -rdc=true switch.
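As a rough sketch of the no-device-linking case, the script below assembles one nvcc command for several architectures. Since the question asks for PTX to be kept, it emits two -gencode flags per architecture: code=sm_XX for SASS and code=compute_XX for PTX. The architecture list (70 and 80), the gencode_flags helper, and the two-file source list are assumptions for illustration, not part of the original commands. The script only prints the command, so it can be checked before it is run.

```shell
#!/bin/sh
# Sketch: print one nvcc command that builds a static library holding
# both SASS and PTX for several architectures.
# ARCHS, SOURCES and gencode_flags are hypothetical, illustrative names.

# Print the -gencode flags for each compute capability given as an argument.
gencode_flags() {
    for cc in "$@"; do
        # SASS for the real architecture ...
        printf ' -gencode arch=compute_%s,code=sm_%s' "$cc" "$cc"
        # ... plus PTX for the matching virtual architecture.
        printf ' -gencode arch=compute_%s,code=compute_%s' "$cc" "$cc"
    done
}

ARCHS="70 80"                          # assumed target architectures
SOURCES="mysource1.cu mysource2.cu"    # assumed source list

# Word splitting on $ARCHS is intentional: one argument per architecture.
CMD="nvcc$(gencode_flags $ARCHS) --lib -o libmy.a $SOURCES"
echo "$CMD"
```

If the printed command looks right, it can be pasted into a shell on a machine with the CUDA toolkit installed. Running cuobjdump --list-ptx libmy.a afterwards should show whether the PTX images ended up in the library.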
