
I'm trying to use keras-rl to train and use an AI for a game that's written in C++ with Python bindings. This is my first time using keras-rl, and I'm finding its expectations at odds with the way the game AI interface is implemented.

As far as I understand, keras-rl expects to drive the reinforcement learning itself via a Gym Environment with the following methods:

def reset(self) -> observation # start a new episode, get first observation
def step(self, action) -> observation, reward, done, info # provide an action, get next observation, etc.
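
For concreteness, this is roughly the usage keras-rl seems to expect (model, nb_actions and env here are just placeholders for the real network, the game's action count and the environment object; the point is that fit() owns the loop and calls reset()/step() itself):

from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.memory import SequentialMemory
from rl.policy import EpsGreedyQPolicy

# model and nb_actions are placeholders for the real network and action count
agent = DQNAgent(model=model, nb_actions=nb_actions,
                 memory=SequentialMemory(limit=50000, window_length=1),
                 policy=EpsGreedyQPolicy())
agent.compile(Adam(lr=1e-3), metrics=['mae'])
agent.fit(env, nb_steps=100000)  # keras-rl drives env.reset()/env.step() itself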

However, the C++ application expects to call the AI via the following methods:

def start(self, episode_info) -> None # start a new episode
def select(self, observation) -> action # provide an observation, get back an action
def end(self, reward) -> None # end an episode (with a reward)

How can I best reconcile these two seemingly contradictory approaches? Do I need to run keras-rl in a separate process and have them communicate somehow?

(Note that the application allows running the AI repeatedly against other AIs, which it specifies in the episode_info and which is how I'm planning to differentiate training from usage.)
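
To make the second half of the question more concrete, the only bridge I can picture so far is something like the untested sketch below (all class names are mine): run keras-rl's fit() in a background thread against a Gym-style environment that blocks on queues, and have the start/select/end object feed those queues from the game's side.

import threading
from queue import Queue

import gym


class BridgeEnv(gym.Env):
    # Gym-style environment whose data arrives from the game via queues.
    def __init__(self, to_env, to_game):
        self.to_env = to_env    # (observation, reward, done) tuples pushed by the game side
        self.to_game = to_game  # actions chosen by keras-rl, handed back to the game

    def reset(self):
        observation, _, _ = self.to_env.get()  # block until the first observation of an episode
        return observation

    def step(self, action):
        self.to_game.put(action)               # pass the chosen action over to the game thread
        observation, reward, done = self.to_env.get()
        return observation, reward, done, {}


class BridgeAI:
    # Object exposing the start/select/end interface that the C++ game calls.
    def __init__(self, agent):
        self.to_env = Queue()
        self.to_game = Queue()
        self.last_observation = None
        env = BridgeEnv(self.to_env, self.to_game)
        # keras-rl drives the environment from its own thread.
        worker = threading.Thread(target=lambda: agent.fit(env, nb_steps=100000),
                                  daemon=True)
        worker.start()

    def start(self, episode_info):
        self.last_observation = None

    def select(self, observation):
        self.last_observation = observation
        self.to_env.put((observation, 0.0, False))  # intermediate steps: no reward yet
        return self.to_game.get()                   # block until keras-rl picks an action

    def end(self, reward):
        # The game only reports a reward at the end, so attribute it all to the final step.
        self.to_env.put((self.last_observation, reward, True))

(I realise this forces intermediate rewards to 0 and doesn't handle fit() finishing while the game keeps calling select(), but it's the kind of coupling I mean by "communicate somehow".)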
