
I am using Unity with ML-Agents and their PPO implementation.

I have one action to train my neural network on, and it expects an input between -1 and 1. When I log the action, I can see that the network always tries values like 550, 630, -530, etc. How can I limit these to values between -1 and 1?

I looked for a setting in Unity but couldn't find one. Now I am trying to modify the PPO algorithm, but I cannot find anything there to limit my values either.

My logging works like this: My Agent has the AgentStep method:

public override void AgentStep(float[] act){
  if (brain.brainParameters.actionSpaceType == StateType.continuous) {
    var actionAC = act[0];
    float[] toLog = new float[2];
    object.move(actionAC);
    // some rewards including toLog[0] as reward log
    toLog[1] = actionAC;
    logger.AddLine(toLog);
  }
}

Logger is a class I wrote that just creates a CSV file. The output then looks like this:

-1 530.73106
-2 530.73106
...
-234.5 -631.9137
...

Thanks in advance.

Mike Wise
ChrizZlyBear

1 Answer


Try var actionAC = Mathf.Clamp(act[0], -1, 1);

This ensures that the value of actionAC is always between -1 and 1.

https://docs.unity3d.com/ScriptReference/Mathf.Clamp.html
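To illustrate what the clamp does to the logged values, here is a minimal sketch that runs outside Unity. The `ActionClamp` class and its `Clamp` helper are hypothetical stand-ins written for this example; they are assumed to behave like `Mathf.Clamp(float, float, float)`, which is what you would call inside `AgentStep`. The sample inputs are taken from the log in the question.

```csharp
using System;

// Hypothetical stand-alone helper, assumed to mirror Mathf.Clamp(float, float, float).
// In a Unity project you would call Mathf.Clamp directly instead.
public static class ActionClamp
{
    public static float Clamp(float value, float min, float max)
    {
        if (value < min) return min;   // below the range: snap to the lower bound
        if (value > max) return max;   // above the range: snap to the upper bound
        return value;                  // already inside the range: unchanged
    }

    public static void Main()
    {
        // Raw network outputs like the ones in the question's log, plus one in-range value.
        float[] rawActions = { 530.73106f, -631.9137f, 0.25f };

        foreach (float raw in rawActions)
        {
            float actionAC = Clamp(raw, -1f, 1f);
            Console.WriteLine(raw + " -> " + actionAC);
        }
    }
}
```

Note that clamping only bounds the value your agent acts on; the network output itself is unchanged, so log actionAC rather than act[0] if you want the CSV to show the value that was actually applied.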

Noodles