TensorFlow - Interpreting the tf.estimator.ProfilerHook "_Send" op

Asked Nov 10 '20 at 12:44

Active Nov 10 '20 at 12:44

I have a deep CNN/RNN that I train on Google AI Platform, distributing the training across 8 GPUs with tf.distribute.MirroredStrategy. I recently upgraded my runtime version from 1.13 to 1.15, and training is now more than 2x slower than before. I read that tf.estimator.ProfilerHook can be used to identify performance bottlenecks, so I collected the profiling information and rendered it at chrome://tracing.

In the resulting trace, a training step spends an entire second on _Send ops. What is this op? I can't find any documentation on it or on why it is in my graph. What does this mean?
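To put a number on this rather than eyeballing the timeline, the trace file can be summarized per op. The following is only a minimal sketch: it assumes the file written by the profiler follows the Chrome trace event format (a top-level `traceEvents` list whose complete events have `"ph": "X"`, a `name`, and a `dur` in microseconds). The helper names and the sample trace are hypothetical and hand-written for illustration; they are not output from my job.

```python
import json
from collections import defaultdict


def total_op_time_us(trace):
    """Sum the duration (in microseconds) of complete events, grouped by event name."""
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        # "X" marks a complete event that carries both a start (ts) and a duration (dur).
        # Metadata events ("M") and others are skipped.
        if event.get("ph") == "X":
            totals[event.get("name", "<unnamed>")] += event.get("dur", 0)
    return dict(totals)


def load_trace(path):
    """Load a trace JSON file from disk (the path is whatever the profiler wrote)."""
    with open(path) as f:
        return json.load(f)


# Hypothetical sample standing in for a real trace file.
sample_trace = {
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 1, "args": {"name": "GPU:0"}},
        {"ph": "X", "name": "_Send", "pid": 1, "tid": 0, "ts": 0, "dur": 600000},
        {"ph": "X", "name": "_Send", "pid": 1, "tid": 0, "ts": 600000, "dur": 400000},
        {"ph": "X", "name": "Conv2D", "pid": 1, "tid": 1, "ts": 0, "dur": 50000},
    ]
}

totals = total_op_time_us(sample_trace)
# Print ops from most to least total time, converted to seconds.
for name, dur_us in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {dur_us / 1e6:.3f} s")
```

Note that this adds up durations across all devices and threads, so ops running in parallel on several GPUs can sum to more than the wall-clock time of the step.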

Andy Carlson

0 Answers