I am noticing something strange when running inference from a TensorRT graph: as I run inference on more frames in a row, the average time per frame goes down. The data is as follows:
Frames | Total time | Rate
---|---|---
1 frame | 6 sec | ~0.17 FPS
3 frames | 12 sec | 0.25 FPS
30 frames | 6 sec | 5 FPS
100 frames | 7.25 sec | 13.7 FPS
1000 frames | 31.337 sec | 32 FPS
10000 frames | 175.118 sec | 57 FPS
100000 frames | 1664.778 sec | 60 FPS
I have also calculated the time while ignoring the first 15 inference calls, and it follows the same pattern. So this rules out the graph initialization time spent on the first few inferences.
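For reference, the warm-up check looked roughly like this (a simplified sketch; `frozen_func`, `get_img_tensor`, and `n_frames` are the same names as in the snippet below, and it assumes `n_frames` is larger than the warm-up count):

```python
import time

WARMUP = 15  # skip the first 15 calls so any graph build/optimization time is excluded

# warm-up calls (not timed)
for i in range(WARMUP):
    _ = frozen_func(get_img_tensor(i))[0].numpy()

# timed calls
start_time = time.time()
for i in range(WARMUP, n_frames):
    _ = frozen_func(get_img_tensor(i))[0].numpy()
end_time = time.time()

print("time per frame (excluding warm-up):", (end_time - start_time) / (n_frames - WARMUP))
```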
The model is a simple MobileNetV2 running on a Jetson Nano 4 GB.
Code snippet of the inference loop:

```python
import time

ans_arr = []
start_time = time.time()
for i in range(n_frames):
    # frozen_func is the converted TensorRT graph; get_img_tensor loads frame i
    output = frozen_func(get_img_tensor(i))[0].numpy()
    ans_arr.append(output)
end_time = time.time()
print("time taken - ", end_time - start_time)
```
What is happening here?