
I'm trying to load ResNeXt50 with CenterNet on top of it. I'm able to do this on Google Colab and on Kaggle's GPU, but:

  1. How much GPU memory (VRAM) does this network need?

  2. On an RTX 2070 with 5.5 GB of VRAM free (out of 8 GB), I'm not able to load it.

The batch size is 1, the number of workers is 1, and everything else is set to its minimum value. OS: Ubuntu 18.04 (using PyTorch).

In TensorFlow, I know that I can restrict the amount of VRAM (which lets me load and run networks even when I don't have enough VRAM), but I haven't found this functionality in PyTorch yet.

Any ideas how to solve this?

Ilan Aizelman

1 Answer


Using a third-party dependency

You could get the size of the model in bytes using the third-party library torchfunc (disclaimer: I'm the author).

import torchfunc

# Assuming model is loaded
print(torchfunc.sizeof(model))

No dependencies

This function is pretty simple and short, so you could just copy it (see the source code).
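
As a rough illustration of the dependency-free route, here is a minimal sketch that sums the storage of a module's parameters and buffers. The helper name `model_size_bytes` is hypothetical, and this is not necessarily how `torchfunc.sizeof` is implemented; it only relies on the standard `parameters()`, `buffers()`, `numel()` and `element_size()` methods.

```python
def model_size_bytes(model):
    """Return the bytes occupied by a model's parameters and buffers.

    `model` is assumed to be a torch.nn.Module (anything exposing
    parameters() and buffers() that yield tensors will work).
    """
    total = 0
    for tensor in list(model.parameters()) + list(model.buffers()):
        # numel() is the element count, element_size() the bytes per element.
        total += tensor.numel() * tensor.element_size()
    return total


# Usage, assuming `model` is an already constructed torch.nn.Module:
# print(f"{model_size_bytes(model) / 1024 ** 2:.1f} MiB")
```

This counts only the weights and buffers, not activations or gradients.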

Exact size

That is only the size of the model itself. The forward and backward passes use additional VRAM, and how much depends on your batch size. You could try the pytorch_modelsize library to estimate that (I'm not sure whether it will work for your networks, though).
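
To give a feel for why training needs more than the bare model size, here is a back-of-the-envelope sketch of the memory that scales with the parameter count: weights, gradients and optimizer state. The function name and the parameter count of 25 million are assumptions for illustration (roughly the size of a ResNeXt50 backbone, without the CenterNet head), and activations, which grow with batch size and input resolution, are deliberately left out.

```python
def training_state_bytes(num_params, bytes_per_param=4, optimizer_states_per_param=2):
    """Lower bound on training memory that depends only on parameter count.

    bytes_per_param=4 assumes fp32; optimizer_states_per_param=2 assumes an
    Adam-style optimizer (use 0 for plain SGD, 1 for SGD with momentum).
    Activations, the CUDA context and allocator overhead are NOT included.
    """
    weights = num_params * bytes_per_param
    gradients = num_params * bytes_per_param
    optimizer_state = num_params * bytes_per_param * optimizer_states_per_param
    return weights + gradients + optimizer_state


# Assumed, approximate parameter count for illustration only.
assumed_params = 25_000_000
print(f"{training_state_bytes(assumed_params) / 1024 ** 2:.0f} MiB before activations")
```

Whatever the real number is for your network, the remainder of the usage comes mostly from activations, so measuring on the actual model is the only reliable check.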

Clear cache

You definitely have enough memory, so you should clear the GPU cache before running the network (in my case, restarting the workstation sometimes helped). Killing other processes that are using the GPU should help as well.

Once again, torchfunc could help. Issue the following command:

import torchfunc

torchfunc.cuda.reset()
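
To find the processes worth killing, a small sketch along these lines could list what is currently holding GPU memory. It assumes `nvidia-smi` is on the PATH and uses its `--query-compute-apps` CSV output; the helper names `parse_compute_apps` and `list_gpu_processes` are made up for this example.

```python
import shutil
import subprocess

# nvidia-smi query for compute processes, as CSV without header or units.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse nvidia-smi CSV rows into (pid, process_name, used_memory) tuples."""
    rows = []
    for line in csv_text.splitlines():
        fields = line.split(",")
        if len(fields) < 3:
            continue  # blank line or an error message, not a data row
        pid = fields[0].strip()
        if not pid.isdigit():
            continue
        # Re-join the middle in case the process name itself contains commas.
        name = ",".join(fields[1:-1]).strip()
        # used_memory stays a string because the driver may report "[N/A]".
        rows.append((int(pid), name, fields[-1].strip()))
    return rows


def list_gpu_processes():
    """Return the compute processes on the GPU, or [] if nvidia-smi is missing."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    for pid, name, used in list_gpu_processes():
        print(f"pid={pid} name={name} used_memory_mib={used}")
```

Any stale PID it reports can then be stopped with `kill <pid>` before loading the network again.
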
Szymon Maszke
  • Thank you Szymon, I'm def. gonna check it soon and update! (Btw, I figured that it takes 5.5GB of Memory, I think I should be able to clear some GPU Memory and be able to run it). Btw, do you suggest to train on fp16 weights for increased speed? – Ilan Aizelman Jan 14 '20 at 13:49
  • Yeah, I don't see many downsides to mixed precision training (use `apex` in such case), may be considered as additional regularization as well. – Szymon Maszke Jan 14 '20 at 16:15