
Case 1

Framework: TensorFlow 2.5.0, Intel-Tensorflow 2.5.0

Environment: Google Colab

I have a model successfully quantized with LPOT that I want to run for inference without using the LPOT API, so I wrote the following inference code:

import tensorflow as tf

# `model` is the SavedModel directory; `input_tensor_name`, `output_tensor_name`,
# `x`, and `y` are assumed to be defined earlier in the notebook.
with tf.compat.v1.Session() as sess:
    tf.compat.v1.saved_model.loader.load(sess, ['serve'], model)
    output = sess.graph.get_tensor_by_name(output_tensor_name)
    predictions = sess.run(output, {input_tensor_name: x})
    mse = tf.reduce_mean(tf.keras.losses.mean_squared_error(y, predictions))
    print(mse.eval())  # evaluate the MSE tensor inside the open session

When running the line predictions = sess.run(output, {input_tensor_name: x}), I get:

---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1374     try:
-> 1375       return fn(*args)
   1376     except errors.OpError as e:

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1359       return self._call_tf_sessionrun(options, feed_dict, fetch_list,
-> 1360                                       target_list, run_metadata)
   1361 

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1452                                             fetch_list, target_list,
-> 1453                                             run_metadata)
   1454 

InternalError: Missing 0-th output from {{node model/layer_1/Conv2D_eightbit_requantize}}

During handling of the above exception, another exception occurred:

InternalError                             Traceback (most recent call last)
<ipython-input-6-2bddd853d111> in <module>()
      2     tf.compat.v1.saved_model.loader.load(sess, ['serve'], model)
      3     output = sess.graph.get_tensor_by_name(output_tensor_name)
----> 4     predictions = sess.run(output, {input_tensor_name: x[:64]}) # 64, 257, 60, 1
      5     mse = tf.reduce_mean(tf.keras.losses.mean_squared_error(y[:64], predictions))
      6     print(mse.eval())

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    966     try:
    967       result = self._run(None, fetches, feed_dict, options_ptr,
--> 968                          run_metadata_ptr)
    969       if run_metadata:
    970         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1189     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1190       results = self._do_run(handle, final_targets, final_fetches,
-> 1191                              feed_dict_tensor, options, run_metadata)
   1192     else:
   1193       results = []

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1367     if handle is None:
   1368       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1369                            run_metadata)
   1370     else:
   1371       return self._do_call(_prun_fn, handle, feeds, fetches)

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1392                     '\nsession_config.graph_options.rewrite_options.'
   1393                     'disable_meta_optimizer = True')
-> 1394       raise type(e)(node_def, op, message)
   1395 
   1396   def _extend_graph(self):

InternalError: Missing 0-th output from node model/layer_1/Conv2D_eightbit_requantize (defined at <ipython-input-6-2bddd853d111>:2) 

This error occurs whether or not Intel-Tensorflow 2.5.0 is installed, and it is not resolved by setting os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1' explicitly.
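Note that an environment variable like this only takes effect if it is set before TensorFlow is first imported, since it is read when the library loads. A minimal sketch of the ordering:

import os

# TF_ENABLE_ONEDNN_OPTS is read when the TensorFlow library loads,
# so it must be set before the first import of tensorflow.
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1'

import tensorflow as tf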

On the other hand, when I run the same code in VS Code with Python 3.6.8 64-bit (base: Conda), it returns the same error message as in Case 2.

Case 2

Framework: TensorFlow 2.4.0, Intel-Tensorflow 2.4.0

Environment: Google Colab

This case works and prints out the MSE loss of the predictions. However, when I uninstall Intel-Tensorflow 2.4.0 and run with official TensorFlow only, the same line as in Case 1 (predictions = sess.run(output, {input_tensor_name: x})) fails:

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1374     try:
-> 1375       return fn(*args)
   1376     except errors.OpError as e:

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1357       # Ensure any changes to the graph are reflected in the runtime.
-> 1358       self._extend_graph()
   1359       return self._call_tf_sessionrun(options, feed_dict, fetch_list,

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _extend_graph(self)
   1397     with self._graph._session_run_lock():  # pylint: disable=protected-access
-> 1398       tf_session.ExtendSession(self._session)
   1399 

InvalidArgumentError: No OpKernel was registered to support Op 'QuantizedMatMulWithBiasAndDequantize' used by {{node model/dense/Tensordot/MatMul_eightbit_requantize}} with these attrs: [input_quant_mode="MIN_FIRST", T1=DT_QUINT8, Toutput=DT_FLOAT, T2=DT_QINT8, Tbias=DT_QINT32, transpose_a=false, transpose_b=false]
Registered devices: [CPU]
Registered kernels:
  <no registered kernels>

     [[model/dense/Tensordot/MatMul_eightbit_requantize]]

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-6-2bddd853d111> in <module>()
      2     tf.compat.v1.saved_model.loader.load(sess, ['serve'], model)
      3     output = sess.graph.get_tensor_by_name(output_tensor_name)
----> 4     predictions = sess.run(output, {input_tensor_name: x[:64]}) # 64, 257, 60, 1
      5     mse = tf.reduce_mean(tf.keras.losses.mean_squared_error(y[:64], predictions))
      6     print(mse.eval())

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    966     try:
    967       result = self._run(None, fetches, feed_dict, options_ptr,
--> 968                          run_metadata_ptr)
    969       if run_metadata:
    970         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1189     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1190       results = self._do_run(handle, final_targets, final_fetches,
-> 1191                              feed_dict_tensor, options, run_metadata)
   1192     else:
   1193       results = []

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1367     if handle is None:
   1368       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1369                            run_metadata)
   1370     else:
   1371       return self._do_call(_prun_fn, handle, feeds, fetches)

/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1392                     '\nsession_config.graph_options.rewrite_options.'
   1393                     'disable_meta_optimizer = True')
-> 1394       raise type(e)(node_def, op, message)
   1395 
   1396   def _extend_graph(self):

InvalidArgumentError: No OpKernel was registered to support Op 'QuantizedMatMulWithBiasAndDequantize' used by node model/dense/Tensordot/MatMul_eightbit_requantize (defined at <ipython-input-6-2bddd853d111>:2)  with these attrs: [input_quant_mode="MIN_FIRST", T1=DT_QUINT8, Toutput=DT_FLOAT, T2=DT_QINT8, Tbias=DT_QINT32, transpose_a=false, transpose_b=false]
Registered devices: [CPU]
Registered kernels:
  <no registered kernels>

     [[model/dense/Tensordot/MatMul_eightbit_requantize]]

The error persists even with os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1' set explicitly.

Conclusion

I believe both cases are caused by the same underlying error, i.e., "No OpKernel was registered to support Op ...".
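To see which ops are involved, one can list the quantized op types the graph actually contains (a minimal sketch, assuming model is the same SavedModel directory as above); every such op needs a oneDNN-backed kernel at inference time:

import tensorflow as tf

# `model` is the same SavedModel directory as in the snippets above.
with tf.compat.v1.Session() as sess:
    tf.compat.v1.saved_model.loader.load(sess, ['serve'], model)
    # Collect the quantized op types in the graph; these are the nodes
    # that require oneDNN-backed kernels at inference time.
    quantized_types = sorted({op.type for op in sess.graph.get_operations()
                              if op.type.startswith('Quantized')})
    for op_type in quantized_types:
        print(op_type)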

I was given to understand that with official TensorFlow v2.5 installed and the environment variable TF_ENABLE_ONEDNN_OPTS=1 set (reference), the quantized model is supposed to run with oneDNN support. But that does not seem to be the case in either v2.4 or v2.5.
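To verify whether a given build actually has oneDNN compiled in, a check commonly cited in Intel's documentation for TF 2.x probes an internal module (a best-effort sketch, since _pywrap_util_port is not a public API):

from tensorflow.python import _pywrap_util_port

# True if this TensorFlow build was compiled with oneDNN/MKL support.
print(_pywrap_util_port.IsMklEnabled())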

My question is: how do I set up an official TensorFlow 2.5 environment with oneDNN support without having to install Intel-Tensorflow? Or why does Intel-Tensorflow 2.5 not work? Thanks.

Joanne H.

1 Answer


LPOT is released in the Intel® AI Analytics Toolkit and works with Intel Optimization of TensorFlow. LPOT can run on any Intel CPU to quantize the AI model. Intel Optimized TensorFlow 2.5.0 requires setting the environment variable TF_ENABLE_MKL_NATIVE_FORMAT=0 before running LPOT quantization or deploying the quantized model.
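For example, a minimal sketch of the required ordering (the variable must be set before TensorFlow is imported):

import os

# Must be set before TensorFlow is imported, per the note above.
os.environ['TF_ENABLE_MKL_NATIVE_FORMAT'] = '0'

import tensorflow as tf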

Please refer to this for more information.

Could you please check whether you quantized the model in TensorFlow 2.4 and are running the inference on TensorFlow 2.5? A plausible explanation for the model running in TensorFlow 2.4 but not in TensorFlow 2.5 is that the operators in TensorFlow 2.5 may not support a model quantized under TensorFlow 2.4.
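One way to verify this (a minimal sketch, assuming model is the SavedModel directory from your snippet) is to compare the TensorFlow version recorded in the exported MetaGraphDef with the runtime version:

import tensorflow as tf

with tf.compat.v1.Session() as sess:
    # loader.load returns the MetaGraphDef; its meta_info_def records
    # the TensorFlow version that exported the model.
    meta_graph_def = tf.compat.v1.saved_model.loader.load(sess, ['serve'], model)
    print('exported with:', meta_graph_def.meta_info_def.tensorflow_version)
    print('running under:', tf.__version__)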

Amir Afianian