
I converted the YOLOv2 frozen graph to a TF-TRT graph using the following code.

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt
from tensorflow.python.platform import gfile

OUTPUT_NAME = ["models/convolutional23/BiasAdd"]

# read the TensorFlow frozen graph
with gfile.FastGFile('./yolov2_frozen-graph.pb', 'rb') as tf_model:
    tf_graphf = tf.GraphDef()
    tf_graphf.ParseFromString(tf_model.read())

# convert (optimize) the frozen model to a TensorRT model
trt_graph = trt.create_inference_graph(
    input_graph_def=tf_graphf,
    outputs=OUTPUT_NAME,
    max_batch_size=1,
    max_workspace_size_bytes=2 * (10 ** 9),
    precision_mode="FP32")

# write the TensorRT model to be used later for inference
with gfile.FastGFile("Yolo_TensorRT_modelFP16.pb", 'wb') as f:
   f.write(trt_graph.SerializeToString())
print("TensorRT model is successfully stored!")

Then I run inference using the following code.

import time

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.python.platform import gfile

with tf.Session() as sess:
    img = cv2.imread("image3.jpg")
    img = cv2.resize(img, (608, 608))

    # read the TensorRT frozen graph
    with gfile.FastGFile('Yolo_TensorRT_modelFP16.pb', 'rb') as trt_model:
        trt_graph = tf.GraphDef()
        trt_graph.ParseFromString(trt_model.read())

    # import the graph and obtain the corresponding input/output tensors
    tf.import_graph_def(trt_graph, name='')
    input = sess.graph.get_tensor_by_name('models/net1:0')
    output = sess.graph.get_tensor_by_name('models/convolutional23/BiasAdd:0')

    for i in range(100):
        start = time.time()
        # perform inference
        sess.run(output, feed_dict={input: [np.asarray(img)]})
        end = time.time() - start
        print("inference time: ", end)

This gives exactly the same performance as the normal YOLOv2 frozen graph, even though I am running inference with the FP16 YOLOv2 frozen TF-TRT graph. Can you tell me what I have to do to increase performance with the TF-TRT graph?
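
For completeness, a timing variant that discards a few warm-up runs before measuring (only a sketch, assuming the same sess, input, output and img as in the code above; the warm-up and run counts are arbitrary) would look like this:

WARMUP_RUNS = 10   # discarded: the first runs can include one-off initialization
TIMED_RUNS = 100

batch = [np.asarray(img)]
for _ in range(WARMUP_RUNS):
    sess.run(output, feed_dict={input: batch})

start = time.time()
for _ in range(TIMED_RUNS):
    sess.run(output, feed_dict={input: batch})
avg_time = (time.time() - start) / TIMED_RUNS
print("average inference time:", avg_time)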
