Sorry, I haven't attached a full script to reproduce this.
The problem is:
when I use a simple for loop like this:
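import random
import numpy as np

# mxnet_model is my already-loaded model object
for _ in range(2000):
    rnum = random.randint(1, 5)  # random batch size between 1 and 5
    img = np.random.rand(rnum, 3, 112, 112)
    mxnet_model.inference(img)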
it works fine.
However, if I wrap the code above in a Flask API,
it causes a GPU memory leak, which is a serious problem.
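
For reference, here is a minimal sketch of what the Flask-wrapped version looks like. It is not my exact code: the `/infer` route, the handler name, and `StubModel` are placeholders I made up for illustration, and `StubModel` only stands in for `mxnet_model` so that the snippet runs without a GPU.

```python
import random

import numpy as np
from flask import Flask, jsonify


class StubModel:
    """Hypothetical stand-in for mxnet_model so the sketch runs without a GPU."""

    def inference(self, img):
        # Return one value per image in the batch.
        return img.mean(axis=(1, 2, 3))


mxnet_model = StubModel()  # assumption: swap in the real model object here
app = Flask(__name__)


@app.route("/infer", methods=["POST"])  # route name is illustrative
def infer():
    # Same random-batch input as in the loop, but one batch per HTTP request.
    rnum = random.randint(1, 5)
    img = np.random.rand(rnum, 3, 112, 112)
    out = mxnet_model.inference(img)
    return jsonify({"batch_size": rnum, "outputs": len(out)})
```

Sending repeated POST requests to the route should then exercise the same inference call as the loop, once per request.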