This is a follow-up to my previous question, in which I reported slow training performance on both CPU and GPU. I did another test run on Google Colab and used wandb to monitor the training, logging CPU and GPU utilization along with several other metrics. It shows 0% GPU utilization, which confirms the GPU is not in use. Either TensorFlow is ignoring my GPU configuration or something else is going on that I don't understand.

I enable the GPU as follows (TensorFlow 2.3.1):

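import tensorflow as tf

# Let TensorFlow grow memory on the first GPU on demand instead of reserving it all up front.
physical_devices = tf.config.experimental.list_physical_devices('GPU')
if len(physical_devices) > 0:
    tf.config.experimental.set_memory_growth(physical_devices[0], True)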
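
Enabling memory growth only changes how GPU memory is allocated; by itself it does not show whether any op actually runs on the GPU. A minimal sketch of one way to check that is below. The function names `gpu_placement_report` and `run_check` are hypothetical helpers made up for this illustration, and the matrix size is arbitrary. It assumes a TensorFlow 2.x runtime running eagerly.

```python
def gpu_placement_report(op_device, visible_gpus):
    """Turn a tensor's device string and the visible-GPU list into a short verdict.

    Hypothetical helper: pure Python, so it can be used without TensorFlow.
    """
    if not visible_gpus:
        return "no GPU visible to TensorFlow"
    if "GPU" in op_device.upper():
        return "op ran on GPU"
    return "GPU visible but op ran on CPU"


def run_check():
    """Run one matmul and report where TensorFlow placed it (assumes TF 2.x, eager mode)."""
    # Imported lazily so the helper above stays usable without TensorFlow installed.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    # Log the device chosen for every op (verbose; intended for debugging only).
    tf.debugging.set_log_device_placement(True)
    a = tf.random.uniform((1000, 1000))
    b = tf.matmul(a, a)
    # In eager mode, b.device holds the full device string of the result tensor.
    return gpu_placement_report(b.device, [g.name for g in gpus])
```

Calling `print(run_check())` in a Colab cell would give a quick verdict. If it reports that the op ran on the GPU while the training loop still shows 0% utilization, the bottleneck is more likely in the training code or input pipeline than in device detection.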

Graphs within a few minutes of training:

[Images: report-1, report-2, report-3]

watch-this
  • Just in case, have you checked in the logs that the GPU is actually found by TensorFlow? – Alex K. Dec 09 '20 at 20:28
  • Yes, the GPU is present in all the recent trial runs. I'm starting to believe that this is more of an implementation problem and less of a TensorFlow problem. – watch-this Dec 09 '20 at 20:59
  • @bullseye It's hard to judge without an actual runnable code sample. Could you share the Colab notebook (or a copy of it)? – tornikeo Dec 10 '20 at 13:33
  • @tornikeo you may use this [notebook](https://colab.research.google.com/gist/amahendrakar/e09b444c63660052f3f679a954cdd587/45546.ipynb) – watch-this Dec 10 '20 at 15:09
  • @bullseye The notebook runs into an error as soon as the training script starts executing. Maybe that's why GPU usage is not showing up. You might wanna check the notebook for errors. – Ayush Chaurasia Jan 29 '21 at 19:10

0 Answers