This is no longer possible. The Rasa documentation you are pointing to is for version 0.8; they are now on version 0.12. There are several factors contributing to why support for this was removed, primarily:
- High memory usage for language models
- Move from Flask to Klein for asynchronous training
Here's a GitHub issue with some more information: https://github.com/RasaHQ/rasa_nlu/issues/793
If you were going for higher overall throughput of `/parse` requests, the recommendation is to use Docker combined with nginx: either run multiple instances on the same server (if the server is large enough to handle it), or run multiple smaller instances, still behind an nginx reverse proxy. A rough sketch of that setup is below.
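As a minimal illustration (the ports, instance count, and Docker image name here are assumptions, not values from the Rasa docs), the nginx side of it could look something like this:

```nginx
# Fragment for the http context (e.g. a file under conf.d/).
# Assumes two Rasa NLU containers were started along the lines of:
#   docker run -d -p 5000:5000 rasa/rasa_nlu
#   docker run -d -p 5001:5000 rasa/rasa_nlu
# (image name and host ports are illustrative)

upstream rasa_nlu {
    # nginx round-robins /parse requests across the instances
    server 127.0.0.1:5000;
    server 127.0.0.1:5001;
}

server {
    listen 80;

    location / {
        proxy_pass http://rasa_nlu;
    }
}
```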
Note that training has already been moved into separate processes. The number of processes available for training can be set with the `--max_training_processes` argument. Some components of the Rasa pipeline also support multiple threads; the number of threads available to those components can be set with the `--num_threads` argument.
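For example, starting the server with both options could look roughly like this (the specific values, and any other flags you need for your setup, are just for illustration):

```sh
# Start the NLU server with 4 training processes and 8 threads
# for the pipeline components that support them (values are illustrative)
python -m rasa_nlu.server \
    --max_training_processes 4 \
    --num_threads 8
```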