I am looking for options that enable dynamic, cloud-based NVIDIA GPU virtualization, similar to the way AWS assigns GPUs to its Cluster GPU Instances.
My project is standing up an internal cloud. One requirement is the ability to allocate GPUs to virtual machines/instances for server-side CUDA processing.
USC appears to be working on OpenStack enhancements to support this, but they aren't ready yet. They would be exactly what I am looking for if they were fully functional in OpenStack.
NVIDIA VGX seems to support allocating GPUs only to USMs (User Selectable Machines), which is strictly remote-desktop GPU virtualization. If I am wrong and VGX does enable server-side CUDA computing from virtual machines/instances, please let me know.