I am using Unity with ML-Agents and their PPO implementation.
I have one action to train my neural network on, and it expects an input between -1 and 1. When I log the action, I can see that the network always tries values like 550, 630, -530, etc. How can I limit the output to values between -1 and 1?
I looked for a setting in Unity but couldn't find one. Now I am trying to modify the PPO algorithm, but I cannot find anything there that limits the values either.
My logging works like this. My agent has the AgentStep method:
    public override void AgentStep(float[] act)
    {
        if (brain.brainParameters.actionSpaceType == StateType.continuous)
        {
            var actionAC = act[0];
            float[] toLog = new float[2];
            object.move(actionAC);
            // some rewards including toLog[0] as reward log
            toLog[1] = actionAC;
            logger.AddLine(toLog);
        }
    }
Logger is a class I wrote that just creates a CSV file. The output then looks like this (first column is the reward, second column is the action):
-1 530.73106
-2 530.73106
...
-234.5 -631.9137
...
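For reference, one way to keep the value inside the expected range is to clamp it on the agent side before it is used. The following is only a minimal sketch: `ActionClamp` and `ClampAction` are hypothetical names, not part of ML-Agents, and clamping only restricts what the agent acts on. It does not stop the network itself from producing large raw outputs.

```csharp
using System;

public static class ActionClamp
{
    // Hypothetical helper, not part of ML-Agents: restricts a raw
    // continuous action to the range [-1, 1].
    public static float ClampAction(float raw)
    {
        if (raw < -1f) return -1f;
        if (raw > 1f) return 1f;
        return raw;
    }

    public static void Main()
    {
        // Two raw values from the log above, plus one in-range value for comparison.
        float[] rawActions = { 530.73106f, -631.9137f, 0.25f };
        foreach (float raw in rawActions)
        {
            // Out-of-range values are pinned to the nearest bound;
            // in-range values pass through unchanged.
            Console.WriteLine(ClampAction(raw));
        }
    }
}
```

Inside Unity, `Mathf.Clamp(actionAC, -1f, 1f)` does the same thing and could be applied to `act[0]` at the top of AgentStep.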
Thanks in advance.