I am working on a voice assistant using Rasa NLU that will later be deployed in an Android/iOS mobile app. Currently I have a Python program where I load the Rasa NLU model and parse incoming input using the `Interpreter` class:
from rasa_nlu.model import Interpreter

interpreter = Interpreter.load("training_data/models/nlu/default/current/")
result = interpreter.parse(user_message)  # user_message is the incoming text
The result is a JSON-like dictionary, which I parse to get the intent and its associated entities. I then feed this output to an AIML interpreter to get the corresponding response; all the responses are stored in AIML files.
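For context, this is roughly the shape of what `interpreter.parse()` returns and how I pull out the intent and entities (the values here are made up for illustration):

```python
# Illustrative sample of the dict returned by interpreter.parse();
# the structure follows Rasa NLU's output, the values are invented.
result = {
    "text": "hello Sara",
    "intent": {"name": "greet", "confidence": 0.95},
    "entities": [
        {"entity": "name", "value": "Sara", "start": 6, "end": 10},
    ],
}

# Extract the intent name and a simple entity -> value mapping
intent = result["intent"]["name"]
entities = {e["entity"]: e["value"] for e in result["entities"]}

print(intent, entities)  # greet {'name': 'Sara'}
```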
My problem is that all of this code currently runs locally on my machine. I want to load the Rasa NLU model on a server and expose some kind of API so I can request a response from the model and pass it to the AIML kernel (also hosted on the server). The model should always be running on the server, and the mobile app will send requests to it. Any suggestions?
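To make the setup concrete, here is a minimal sketch of the kind of server I have in mind, using only the Python standard library. The `nlu_parse` and `aiml_respond` functions are stand-ins for the real `interpreter.parse()` call and the AIML kernel lookup (names and port are hypothetical):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for interpreter.parse(); on the server this would call the
# loaded Rasa NLU model instead of returning a canned result.
def nlu_parse(text):
    return {"intent": {"name": "greet", "confidence": 0.9}, "entities": []}

# Stand-in for the AIML kernel lookup keyed on intent/entities.
def aiml_respond(intent, entities):
    return "Hello there!"

def handle_message(text):
    """Run one message through NLU and the AIML responder."""
    parsed = nlu_parse(text)
    return aiml_respond(parsed["intent"]["name"], parsed["entities"])

class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        reply = handle_message(payload["message"])
        body = json.dumps({"response": reply}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# On the server, something like:
# HTTPServer(("0.0.0.0", 5005), ChatHandler).serve_forever()
```

The mobile app would then POST `{"message": "..."}` to this endpoint and read the response text back. I am not sure whether writing my own wrapper like this is the right approach, or whether I should use Rasa's built-in HTTP server instead.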