Let me first describe the setup. We have an autonomous agent in Unity whose decisions are based on the perceived environment (the level) and some predefined parameters for value mapping. Our aim is to pre-train the agent's parameters with a DNN. The idea is to define an error metric that evaluates the agent's performance in a Unity simulation (run the level and, for example, measure the deviation from the optimal trajectory, which is the ground truth in Unity). Given a level as input, the network should learn to output the parameters; the simulation is then run with those parameters, and the measured error is passed back to the network as its error value, analogous to accuracy, so that the network can train on that error/performance.

Is there any way to perform the evaluation (the comparison to the ground truth) during training outside of Keras? Usually one passes X data to the network, trains it, and compares the output to the ground truth Y. This works fine for predictions, but I don't want to predict something. What I do want is to measure the deviation from the ground truth inside the simulation. I know there is Unity ML-Agents, but as far as I could read, the 'brain' controls the agent at runtime, i.e. it is updated on every frame and controls the movement. What I want is to run the whole simulation and then update the params/weights of the network.

Best wishes.

Mike Wise
00zetti
  • Ground truth of what exactly? Value? Unobserved state features? You say you don't (or "don't want to") predict anything (although you are estimating value) so it's unclear exactly what you're trying to compare between ground truth and estimate. – Ruzihm Oct 25 '18 at 19:14
  • Assuming it is related to omniscient state information: If you already have a channel of communication from Unity to Keras that sends observed state and receives actions, the simplest way would be to *also* pass whatever omniscient state features you're interested in along with the observed state over to Keras, and have Keras only use the observed state information as NN input--the omniscient info only being used as part of error calculation. – Ruzihm Oct 25 '18 at 19:24
  • The ground truth in the simulation should be the perfect trajectory, so as the error I want to measure e.g. the deviation of the simulated trajectory from the ground truth. Therefore I need to run the simulation with the agent parameters output by the network, measure the error, and return the error to the network. The 'ground truth' (Y data) in the network would be 0 (as there is no deviation from the perfect trajectory); at least this is the setup I have in mind. – 00zetti Oct 25 '18 at 19:26

1 Answer

After some discussions at my university: the setup won't work this way, since I need to split the process into two steps.

First, I need the parameters of working agents, so that I can later train the network based only on the level description (e.g. a matrix, similar to a video game description language). To obtain these parametrized agents from the actual level and the ground truth data (e.g. the deviation from the trajectory), one needs to use deep reinforcement learning with a score function. This is where Unity ML-Agents might be useful. Afterwards, I can use the parameter settings and the corresponding level data to train a network that yields the desired parameters based only on the level description.
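To make the two steps concrete, here is a minimal sketch in plain Python. It is only an illustration under several assumptions: `simulate_deviation` is a hypothetical stand-in for a full Unity run (here a toy function whose ideal parameters depend linearly on two level features), step 1 uses a simple hill-climbing search in place of deep reinforcement learning, and step 2 fits a linear model by gradient descent instead of a Keras network. All function names are made up for this example.

```python
import random


def simulate_deviation(level, params):
    """Hypothetical stand-in for one complete Unity run.

    Returns the squared deviation of the agent from the ideal trajectory.
    In this toy version the ideal parameters are a linear function of the level.
    """
    ideal = [2.0 * level[0] + 1.0, -1.0 * level[1] + 0.5]
    return sum((p - q) ** 2 for p, q in zip(params, ideal))


def search_params(level, rng, iters=400, step=0.5):
    """Step 1: find good parameters for one level by black-box search.

    Simple hill climbing with an adaptive step size (not reinforcement
    learning); it only needs the scalar score returned by the simulation.
    """
    best = [0.0, 0.0]
    best_err = simulate_deviation(level, best)
    for _ in range(iters):
        cand = [p + rng.gauss(0.0, step) for p in best]
        err = simulate_deviation(level, cand)
        if err < best_err:
            best, best_err = cand, err
            step *= 1.5  # widen the search after an improvement
        else:
            step *= 0.9  # narrow the search after a failure
    return best


def fit_linear(levels, targets, epochs=500, lr=0.1):
    """Step 2: supervised regression from level description to parameters.

    One linear model (with bias) per output parameter, trained by
    stochastic gradient descent on the squared error.
    """
    n_in, n_out = len(levels[0]), len(targets[0])
    weights = [[0.0] * (n_in + 1) for _ in range(n_out)]
    for _ in range(epochs):
        for x, y in zip(levels, targets):
            xb = x + [1.0]  # append the bias input
            for k in range(n_out):
                pred = sum(w * v for w, v in zip(weights[k], xb))
                grad = pred - y[k]
                for i in range(n_in + 1):
                    weights[k][i] -= lr * grad * xb[i]
    return weights


def predict(weights, level):
    """Map a level description to agent parameters with the fitted model."""
    xb = level + [1.0]
    return [sum(w * v for w, v in zip(row, xb)) for row in weights]


rng = random.Random(0)
levels = [[rng.random(), rng.random()] for _ in range(20)]
targets = [search_params(level, rng) for level in levels]  # step 1
weights = fit_linear(levels, targets)                      # step 2

new_level = [0.3, 0.7]
new_params = predict(weights, new_level)
print("predicted params:", new_params)
print("deviation in simulation:", simulate_deviation(new_level, new_params))
```

The point of the split is that step 1 treats the simulation as a black box that only returns a score, so nothing has to be differentiated through Unity, while step 2 is an ordinary supervised problem with (level, parameters) pairs that Keras can handle in the usual way.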

00zetti