Unity/ml-agents, calling with right argument for multiple brains

Question

Playing with this. I can successfully train 1 brain, but when I try to train 2 brains I get the following error:

---------------------------------------------------------------------------
UnityActionException                      Traceback (most recent call last)
<ipython-input-4-520c26ebec47> in <module>()
     48 
     49 
---> 50         new_info = trainer.take_action(info, env, brain_name)
     51 
     52 

C:\UNITY\ml-agents-master\python\ppo\trainer.py in take_action(self, info, env, brain_name)
     51         self.stats['value_estimate'].append(value)
     52         self.stats['entropy'].append(ent)
---> 53         new_info = env.step(actions, value={brain_name: value})[brain_name]
     54         self.add_experiences(info, new_info, epsi, actions, a_dist, value)
     55         return new_info

C:\UNITY\ml-agents-master\python\unityagents\environment.py in step(self, action, memory, value)
    288                     raise UnityActionException(
    289                         "You have {0} brains, you need to feed a dictionary of brain names a keys, "
--> 290                         "and actions as values".format(self._num_brains))
    291                 else:
    292                     action = {self._brain_names[0]: action}

UnityActionException: You have 2 brains, you need to feed a dictionary of brain names a keys, and actions as values

Here I found this part, relevant to my problem:

Step : env.step(action, memory=None, value = None)

Sends a step signal to the environment using the actions. Note that if you have more than one brain in the environment, you must provide a dictionary from brain names to actions.

action can be a one-dimensional array, or a two-dimensional array if you have multiple agents per brain.

memory is an optional input that can be used to send a list of floats per agent to be retrieved at the next step.

value is an optional input that can be used to send a single float per agent to be displayed if an AgentMonitor.cs component is attached to the agent.

Returns a dictionary mapping brain names to BrainInfo objects.

But I am not sure how to interpret it. Can someone suggest how I should construct the argument so that I can use 2 brains in my environment?

Thanks!

score 0 · Answer 1 · edited Sep 27 '17 at 21:44


If you have only one brain, you can input a list or a numpy array of floats.

If you have multiple brains, you must send a dictionary.

Example: if you have 2 brains ('brain1' and 'brain2'), each with one agent that takes 2 continuous actions, you must call: env.step({'brain1': [0.1, 0.2], 'brain2': [0.3, 0.4]})
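As a minimal sketch of the dictionary described above, the snippet below builds the argument and checks that every brain has an entry before it would be passed to env.step. The helper build_action_dict is hypothetical (it is not part of ml-agents), the brain names are placeholders, and the two-agent layout follows the quoted documentation's note about two-dimensional arrays; the real env.step call is left as a comment because it needs a running Unity environment.

```python
def build_action_dict(brain_names, actions_per_brain):
    """Hypothetical helper: return {brain_name: actions} with one entry per brain."""
    missing = [name for name in brain_names if name not in actions_per_brain]
    if missing:
        raise ValueError("No action supplied for brain(s): " + ", ".join(missing))
    return {name: actions_per_brain[name] for name in brain_names}


# Placeholder names; with a real environment these would come from env.brain_names.
brain_names = ["brain1", "brain2"]

# One agent per brain, two continuous actions each.
single_agent = build_action_dict(
    brain_names, {"brain1": [0.1, 0.2], "brain2": [0.3, 0.4]}
)

# Two agents under brain1: one row of actions per agent (assumed layout).
multi_agent = build_action_dict(
    brain_names, {"brain1": [[0.1, 0.2], [0.5, 0.6]], "brain2": [0.3, 0.4]}
)

# With a live environment the call would then be: env.step(single_agent)
```

Checking for missing brains up front gives a clearer error than waiting for the environment to reject the step.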

edited Sep 27 '17 at 21:44

pacman


answered Sep 27 '17 at 20:45

Vincent-Pierre


Still errors :(

    43 for name in env.brain_names:
    ---> 44 new_info = trainer.take_action(info, env, name)

    ...\trainer.py in take_action(self, info, env, brain_name)
    53 for name in brain_name:
    ---> 54 new_info = env.step(action={name: actions}, value={name: value})[name]

    ...\environment.py in step(self, action, memory, value)
    310 if b not in action:
    --> 311 raise UnityActionException

    UnityActionException: You need to input an action for the brain Ball3DBrain

– Maximus Sep 28 '17 at 09:23
What is `name` set to? You should have `new_info = env.step(action={name: actions, 'Ball3DBrain': other_actions}, value={name: value})[name]` instead of `new_info = env.step(action={name: actions}, value={name: value})[name]`. I suggest you post an issue on the [github page](https://github.com/Unity-Technologies/ml-agents/issues) to get faster support. By the way, there is currently no support for PPO with multiple brains. If you have more than one brain, you will have to either train them separately or implement your own custom neural network(s). – Vincent-Pierre Sep 28 '17 at 20:44
Ah, this clears things up! Will try when I get home. For training, I prefer to call "train" for each brain explicitly, at different steps in the environment, with each brain having its own neural network, effectively mirroring the first one (and using GA to slowly diverge the neural networks over time). – Maximus Sep 30 '17 at 08:14
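The point made in the comments, that a single step call needs an action for every brain, can be sketched roughly as follows. FakeEnv is a hypothetical stand-in written only for this illustration (it is not the ml-agents environment class and raises ValueError rather than UnityActionException); it imitates the two checks visible in the tracebacks. The policies are placeholder callables, and step_all is an assumed helper name.

```python
class FakeEnv:
    """Hypothetical stand-in for the Unity environment; not the real class."""

    def __init__(self, brain_names):
        self.brain_names = list(brain_names)

    def step(self, action, value=None):
        # Imitate the two checks seen in the tracebacks in this thread.
        if not isinstance(action, dict):
            if len(self.brain_names) > 1:
                raise ValueError("need a dictionary of brain names to actions")
            action = {self.brain_names[0]: action}
        for b in self.brain_names:
            if b not in action:
                raise ValueError("need an action for the brain " + b)
        # Echo each brain's action back in place of a BrainInfo object.
        return {b: action[b] for b in self.brain_names}


def step_all(env, policies):
    """Collect one action per brain, then call env.step exactly once."""
    actions = {name: policies[name]() for name in env.brain_names}
    return env.step(actions)


env = FakeEnv(["brain1", "Ball3DBrain"])
policies = {
    "brain1": lambda: [0.1, 0.2],       # placeholder policy
    "Ball3DBrain": lambda: [0.3, 0.4],  # placeholder policy
}
new_info = step_all(env, policies)
```

Calling step once per brain inside a loop, as in the failing snippet, leaves the other brain without an action on each call; gathering all actions first avoids that.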
