I am working on a PyTorch project and want to disable data parallelization so that each instance of the program runs on a single specified GPU, without duplicating memory across devices. I have followed the standard steps of moving the model to the desired GPU device and disabling data parallelization. However, when I launch multiple instances of the program simultaneously, I see memory duplicated across multiple GPUs.
Here are the steps I have taken:
I move the model to the desired GPU with model.to(device), where device is set to a specific GPU (e.g., torch.device("cuda:0")).
I wrap the model in DataParallel restricted to that single device, which I expected to effectively disable data parallelization:
model = model.to(device)
model = torch.nn.DataParallel(model, device_ids=[device])
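Put together, my understanding of these two steps is captured by the following minimal, self-contained sketch (MyModel and the random input are just placeholders; my real model is the AlexNet shown further below):

import torch
import torch.nn as nn

# Placeholder network standing in for my real model.
class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 8)

    def forward(self, x):
        return self.fc(x)

device = torch.device("cuda:0")  # the single GPU this instance should use

model = MyModel().to(device)  # step 1: move the parameters to cuda:0
model = torch.nn.DataParallel(model, device_ids=[device])  # step 2: restrict DataParallel to that one device

x = torch.randn(4, 16, device=device)  # dummy input created directly on the same device
outputs = model(x)
print(outputs.shape, next(model.parameters()).device)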
Despite these steps, memory is still duplicated across multiple GPUs when several instances of the program run at the same time. I want each instance to use only its designated GPU, without memory duplication.
The nvidia-smi results are as follows:
When only a single GPU is active: [nvidia-smi screenshot]
When two GPUs are active but only one Python program is running, the memory is almost duplicated: [nvidia-smi screenshot]
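Besides nvidia-smi, I can also print this process's own allocations from inside Python with something like the snippet below (just a diagnostic sketch, not part of my program; it only reflects this process's caching allocator, unlike nvidia-smi, which shows all processes):

import torch

# Per-device memory held by this process's caching allocator.
for i in range(torch.cuda.device_count()):
    allocated = torch.cuda.memory_allocated(i) / 1024**2
    reserved = torch.cuda.memory_reserved(i) / 1024**2
    print("cuda:{}: allocated {:.1f} MiB, reserved {:.1f} MiB".format(i, allocated, reserved))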
The parts of the program relevant to this problem are as follows:
import GPUtil
import torch

def load_model_on_lowest_memory_gpu(model):
    # Pick the GPU with the lowest current memory usage.
    available_gpus = GPUtil.getAvailable(order='memory', limit=torch.cuda.device_count())
    selected_gpu = torch.device("cuda:{}".format(available_gpus[0]))
    print(selected_gpu)  # log which GPU was chosen
    # Move the model to that GPU and restrict DataParallel to this single device.
    model = model.to(selected_gpu)
    model = torch.nn.DataParallel(model, device_ids=[selected_gpu])
    return model
# in __main__:
net = AlexNet.AlexNet(8)  # AlexNet.AlexNet is my own model class
net.load_state_dict(torch.load(dict_path))  # load the saved weights
net = load_model_on_lowest_memory_gpu(net)
# when using the net (images is the batch of input images):
outputs = net(images.cuda())
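To rule out an obvious placement mistake, I can add a quick check right before the forward pass (a small diagnostic sketch; it only prints where the tensors end up):

# Diagnostic: where do the wrapped model's parameters and the input batch actually live?
param_device = next(net.parameters()).device
print("model parameters on:", param_device)
print("input batch on:", images.cuda().device)  # .cuda() with no index uses the current CUDA device, cuda:0 unless changed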
Am I missing something in the configuration, or is there another step I should follow to achieve this? Any help or guidance would be greatly appreciated.