Edit
The numbering is consistent. To quote @Robert Crovella:
"The ordering is consistent across processes, and consistent from run
to run. This statement is true whether you select the default CUDA
numbering, or the PCI based ordering. The run to run statement is
true as long as you don't switch CUDA versions, update the system
BIOS, change operating systems, change the hardware configuration of
the system (e.g. add/remove devices), or change from default to PCI
ordering. It also assumes you make no changes to the
CUDA_VISIBLE_DEVICES environment variable."
The documentation on Device Enumeration and Properties describes an environment variable named CUDA_DEVICE_ORDER with two possible values, FASTEST_FIRST and PCI_BUS_ID.
According to the documentation, FASTEST_FIRST causes CUDA to guess which device is fastest using a simple heuristic and make that device 0, leaving the order of the remaining devices unspecified. PCI_BUS_ID orders devices by PCI bus ID in ascending order.
The default value of this environment variable is FASTEST_FIRST. The default numbering can therefore differ from the numbering under PCI_BUS_ID if your devices happen to have different speeds.
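As a rough illustration of what PCI_BUS_ID ordering means, here is a minimal C++ sketch that models the index assignment by sorting devices on their PCI location. PciLocation and orderByPciBusId are hypothetical names made up for this example, and comparing by (domain, bus, device) is my assumption about how "ascending PCI bus ID" is interpreted. The sketch does not call the CUDA runtime.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical record for illustration only; this is not a CUDA type.
struct PciLocation {
    int domain;
    int bus;
    int device;
};

// Returns the devices sorted ascending by (domain, bus, device).
// In this model, a device's position in the returned vector stands in for
// the index it would get with CUDA_DEVICE_ORDER=PCI_BUS_ID (assumption).
std::vector<PciLocation> orderByPciBusId(std::vector<PciLocation> devices) {
    std::sort(devices.begin(), devices.end(),
              [](const PciLocation& a, const PciLocation& b) {
                  return std::tie(a.domain, a.bus, a.device) <
                         std::tie(b.domain, b.bus, b.device);
              });
    return devices;
}
```

In this model, devices on buses 0x82, 0x03 and 0x41 (same domain) would end up at indices 2, 0 and 1 respectively, regardless of which one is fastest.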
You can set CUDA_DEVICE_ORDER via:
export CUDA_DEVICE_ORDER=PCI_BUS_ID
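With this setting, device IDs follow ascending PCI bus ID order, so each ID keeps referring to the same physical device as long as the hardware configuration does not change.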
Or, in host code, you can look up the device ID from the PCI bus ID:
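int dev = 0;
// "somebusId" is a placeholder for the PCI bus ID string of the device you want
cudaError_t errCode = cudaDeviceGetByPCIBusId(&dev, "somebusId");
if (errCode == cudaSuccess) {
    cudaSetDevice(dev);  // only select the device if the lookup succeeded
}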
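The bus ID string passed to cudaDeviceGetByPCIBusId has the form domain:bus:device.function. If you only have the numeric parts (for example the pciDomainID, pciBusID and pciDeviceID fields of cudaDeviceProp), a small helper can build the string. This is a minimal sketch: formatPciBusId is a hypothetical name, the field widths and lowercase hex digits are my assumptions, and the function number defaults to 0. It does not call the CUDA runtime itself.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a PCI location as "domain:bus:device.function"
// using hexadecimal digits, e.g. "0000:03:00.0".
// Field widths (4, 2, 2, 1) and lowercase hex are assumptions for this sketch.
std::string formatPciBusId(int domain, int bus, int device, int function = 0) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%04x:%02x:%02x.%x",
                  domain, bus, device, function);
    return std::string(buf);
}
```

The resulting string could then be passed via .c_str() in place of the placeholder bus ID in the lookup call above it.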