I have a Simulink simulation that needs an action at multiple points, each based on the simulation state at that point. I use Python to run the reinforcement learning (RL).
My implementation: The Simulink simulation is started from Python using the MATLAB Engine for Python. The simulation pauses whenever an action is needed. The Python script detects the pause and reads parameters from Simulink as the observation. The RL agent uses the observation to choose an action, which is written to the Simulink workspace, and the simulation is then resumed. This cycle repeats throughout training.
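The loop described above might look roughly like the following sketch. It assumes the model writes its observation to a base-workspace variable and reads the action from another one. The names `obs` and `action`, the `policy` callable, and the polling interval are hypothetical and not taken from the actual setup. The `set_param`/`get_param` calls with `SimulationCommand` and `SimulationStatus` are the standard Simulink commands, invoked here through the engine object.

```python
import time


def run_episode(eng, model, policy, obs_var="obs", action_var="action",
                poll_interval=0.01):
    """Run one Simulink simulation, supplying an action at every pause.

    eng    : a MATLAB Engine for Python session (matlab.engine.start_matlab()).
    model  : name of an already loaded Simulink model.
    policy : callable mapping an observation to an action (hypothetical
             stand-in for the RL agent).
    obs_var / action_var : assumed base-workspace variable names.
    Returns the number of actions taken during the episode.
    """
    steps = 0
    eng.set_param(model, "SimulationCommand", "start", nargout=0)

    while True:
        status = eng.get_param(model, "SimulationStatus")

        if status == "stopped":
            # The simulation has finished, so the episode is over.
            break

        if status == "paused":
            # The model paused itself because it needs an action.
            obs = eng.workspace[obs_var]
            eng.workspace[action_var] = policy(obs)
            eng.set_param(model, "SimulationCommand", "continue", nargout=0)
            steps += 1
        else:
            # Still running (or initializing): wait and poll again.
            time.sleep(poll_interval)

    return steps
```

With a real session this would be called as `run_episode(eng, "my_model", agent_fn)` after `eng = matlab.engine.start_matlab()` and `eng.load_system("my_model", nargout=0)`, where `my_model` and `agent_fn` are placeholders. Every pause costs at least one polling round trip through the engine plus the workspace reads and writes, which is where the waiting time in Python accumulates.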
Challenge: This setup works, but pausing the simulation and waiting in Python is time-consuming. Running just 200 simulations takes more than 5 hours (over 90 seconds per simulation on average), which is too slow for training an RL agent.
Is there a better, faster approach for this use case?