The model was trained in Python. I have looked into different deployment routes but hit a wall here or there. I summarize them below; please correct me if I am wrong:
+--------------------+-----+------+-----------------+
| Approach           | C++ | FP16 | non-fixed shape |
+--------------------+-----+------+-----------------+
| TensorFlow C++ API | ✓   | ?    | ✓               |
| TensorRT           | ✓   | ✓    | ✗               |
| TF-TRT             | ✗   | ✓    | ✓               |
+--------------------+-----+------+-----------------+
- As of TensorRT 5.1, non-fixed (dynamic) input shapes are still not supported (see the sketch after this list)
- TF-TRT = tensorflow/tensorrt, and for the moment it is only available in Python
- I am aware of TensorRT Inference Server, but I would rather not go for a network-communication-based solution if I don't have to
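To make the fixed-shape point concrete, here is roughly how a network has to be defined with the TensorRT 5.x C++ API, as far as I understand it; the input dimensions and the single ReLU layer are placeholders, not my actual model:

```cpp
#include <NvInfer.h>
#include <iostream>

// Minimal logger that the TensorRT builder requires.
class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) override {
    if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
  }
} gLogger;

int main() {
  nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(gLogger);
  nvinfer1::INetworkDefinition* network = builder->createNetwork();

  // In TensorRT 5.x every dimension except the implicit batch dimension
  // must be fully specified here; 3x224x224 is just a placeholder.
  nvinfer1::ITensor* input = network->addInput(
      "input", nvinfer1::DataType::kFLOAT, nvinfer1::Dims3{3, 224, 224});
  nvinfer1::IActivationLayer* relu =
      network->addActivation(*input, nvinfer1::ActivationType::kRELU);
  network->markOutput(*relu->getOutput(0));

  builder->setMaxBatchSize(1);  // only the batch size may vary at runtime
  builder->setFp16Mode(true);   // FP16 works fine, unlike dynamic shapes
  nvinfer1::ICudaEngine* engine = builder->buildCudaEngine(*network);

  if (engine) engine->destroy();
  network->destroy();
  builder->destroy();
  return 0;
}
```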
The "?" in the table means the interplay with Eigen::half
under tensorflow/core/kernels (e.g., inside conv_2d_gpu_half.cu.cc) to achieve FP 16 arithmetic with TensorFlow C++. I don't see many documentation on this but is it the only way to go?
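For that Eigen::half route, the best I can come up with is a sketch like the following: convert the graph to fp16 beforehand on the Python side, then feed and fetch DT_HALF tensors through Eigen::half in C++. The file name and the node names "input"/"output" are made up:

```cpp
#include <memory>
#include <vector>

#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"

using tensorflow::Tensor;

int main() {
  // Load a frozen graph that was already converted to fp16 in Python.
  tensorflow::GraphDef graph_def;
  TF_CHECK_OK(tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                          "model_fp16.pb", &graph_def));

  std::unique_ptr<tensorflow::Session> session(
      tensorflow::NewSession(tensorflow::SessionOptions()));
  TF_CHECK_OK(session->Create(graph_def));

  // DT_HALF tensors are filled through Eigen::half on the C++ side.
  Tensor input(tensorflow::DT_HALF,
               tensorflow::TensorShape({1, 224, 224, 3}));
  auto flat = input.flat<Eigen::half>();
  for (int i = 0; i < flat.size(); ++i) flat(i) = Eigen::half(0.0f);

  std::vector<Tensor> outputs;
  TF_CHECK_OK(session->Run({{"input", input}}, {"output"}, {}, &outputs));
  return 0;
}
```

If this is indeed the intended usage, I would still like to know whether it actually dispatches to the half-precision kernels, or whether there is a more direct, documented path.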
(I am fine with converting my model to other frameworks like MXNet, but similar limitations seem to apply: just change TensorFlow C++ API→MXNet C++ Package, TensorRT→TVM, and TF-TRT→MXNet-TensorRT in the table.)