I have a bunch of Java code that constitutes an environment and an agent. I want to use one of the Python reinforcement learning libraries (stable-baselines, tf-agents, rllib, etc.) to train a policy for the Java agent/environment. And then deploy the policy on the Java side for production. Is there standard practice for incorporating other languages into Python RL libraries? I was thinking of one of the following solutions:
- Wrap Java env/agent code into REST API, and implement custom environment in Python that calls that API to step through the environment.
- Use Py4j to invoke Java from Python and implement custom environment.
Which one would be better? Are there any other ways?
Edit: I ended up going the former - deploying a web server that encapsulates the environments. works quite well for me. Leaving the question open in case there is a better practice to handle this kind of situations!