I have a Flask app that runs a TensorFlow model (deezer/spleeter), but I believe the problem is related to TensorFlow rather than spleeter. When I call the Flask API, GPU memory is allocated as expected, but once the request finishes the memory is never released; it stays allocated until I restart the Flask server, which is not how it should behave. I tried the same model in a plain Python shell and it behaves the same way: the GPU memory is only freed when I stop the Python process.

The only solution I found that half works is to run the model inside a multiprocessing.Process. That approach works in the Python shell, but in the Flask app I get errors like this:
Allocation of XXXXXXXXX exceeds 10% of free system memory.
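For reference, the shell experiment that does release the GPU memory looked roughly like this (the model name and the file paths are placeholders, not my real code):

from multiprocessing import Process
from spleeter.separator import Separator

def separate(input_path, output_dir):
    # Loading and running the model inside the child process means
    # TensorFlow's GPU allocation lives and dies with that process
    separator = Separator('spleeter:2stems')
    separator.separate_to_file(input_path, output_dir)

p = Process(target=separate, args=('audio.wav', 'output/'))
p.start()
p.join()
# Once the child process exits, the GPU memory is freed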
Here is roughly what my code looks like (unfortunately I cannot share all of it):
from multiprocessing import Process
from flask import Flask

app = Flask(__name__)

def function(...):
    # Call the deezer/spleeter model
    ...

@app.route('/api', methods=['POST'])
def api():
    # Run the model in a child process so that the GPU memory
    # is released when the process exits
    process = Process(target=function, args=(...))
    process.start()
    process.join()
    return data
As I said, the same multiprocessing approach works fine when run from a plain Python shell. I should also mention that everything runs inside a Docker container based on the tensorflow/tensorflow:latest-gpu image.
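TensorFlow does see the GPU inside the container (the GPU memory usage shows up as soon as the model runs); a quick check with the standard tf.config API confirms it:

import tensorflow as tf

# Inside the container this prints the GPU device,
# so the GPU itself is detected correctly
print(tf.config.list_physical_devices('GPU'))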
Any help with this issue would be much appreciated; alternative approaches are welcome as well.