I'm getting this message:

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

How do I do that in Tensorflow 2.3?

Over the past few days this turned out to be a surprisingly frustrating issue. There appears to be no working example of how to do this in TF2.

mafu
  • Does this help? https://stackoverflow.com/questions/49665757/how-to-add-report-tensor-allocations-upon-oom-to-runoptions-in-keras – rvinas Jan 23 '21 at 19:56
  • @rvinas I sadly cannot test it currently, but I recall having seen that link. It did not solve the problem for me, IIRC because it was aimed at TF1 and was not compatible with or convertible to TF2. Would love to be proven wrong, though. – mafu Jan 25 '21 at 04:59
  • Any news on the issue? I'm still stuck with the `TypeError: Invalid keyword argument(s) in 'compile': {'run_metadata', 'options'}` when using tf.compat.v1, and I don't know what to do with the other suggested answer... – Manuel Popp Oct 02 '21 at 13:41

2 Answers

This is still a long way from an allocated tensor list, but a start for TF2:

TensorFlow 2.4.1 contains the tf.config.experimental.get_memory_usage method, which returns the current number of bytes allocated on the GPU. Comparing this value at different points in time can shed some light on which tensors take up VRAM. It seems to be pretty accurate.

BTW, the latest nightly build contains the tf.config.experimental.get_memory_info method instead; it seems they had a change of heart. This one returns both the current and the peak memory used.

Example code on TF 2.4.1:

import tensorflow as tf

print(tf.config.experimental.get_memory_usage("GPU:0"))  # 0

tensor_1_mb = tf.zeros((1, 1024, 256), dtype=tf.float32)
print(tf.config.experimental.get_memory_usage("GPU:0"))  # 1050112

tensor_2_mb = tf.zeros((2, 1024, 256), dtype=tf.float32)
print(tf.config.experimental.get_memory_usage("GPU:0"))  # 3147264

tensor_1_mb = None
print(tf.config.experimental.get_memory_usage("GPU:0"))  # 2098688

tensor_2_mb = None
print(tf.config.experimental.get_memory_usage("GPU:0"))  # 1536
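Comparing readings at different points, as above, generalizes into a tiny helper. The sketch below defines a hypothetical `MemoryTracker` class (not a TensorFlow API) that records readings from any byte-returning function and prints the delta between consecutive checkpoints; on TF 2.4.1 you would plug in `lambda: tf.config.experimental.get_memory_usage("GPU:0")`:

```python
class MemoryTracker:
    """Record labeled memory checkpoints and report deltas between them.

    `measure` is any zero-argument callable returning bytes currently in use,
    e.g. lambda: tf.config.experimental.get_memory_usage("GPU:0") on TF 2.4.1.
    """

    def __init__(self, measure):
        self.measure = measure
        self.readings = []  # list of (label, bytes) tuples

    def checkpoint(self, label):
        current = self.measure()
        # Delta vs. the previous checkpoint; 0 for the first reading.
        delta = current - self.readings[-1][1] if self.readings else 0
        self.readings.append((label, current))
        print(f"{label}: {current} bytes (delta {delta:+d})")
        return delta
```

Calling `checkpoint("after layer X")` before and after allocating a tensor attributes the VRAM growth to that step, which is the closest TF2 equivalent I found to an allocated-tensor list.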
Synthesis

To use RunOptions you also need a RunMetadata object. In TF2, both can be found in the tf.compat.v1 package. The following code works for Keras 2.4.3 with a TensorFlow 2.4.0 backend:

import tensorflow as tf

# Build the model, then compile with RunOptions/RunMetadata attached:
run_opts = tf.compat.v1.RunOptions(report_tensor_allocations_upon_oom=True)
run_meta = tf.compat.v1.RunMetadata()
keras_model.compile(optimizer=..., loss=..., options=run_opts, run_metadata=run_meta)
# Then call keras_model.fit() as usual.