I am building an LSTM network whose outputs are one-hot encoded directions: Left, Right, Up, and Down. The predictions come out like:
```
[0. 0. 1. 0.]
[1. 0. 0. 0.]
[0. 0. 1. 0.]
...
[0. 0. 1. 0.]
[0. 0. 1. 0.]
[0. 0. 0. 0.]
```
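For context, this is roughly how I produce the encodings (a minimal NumPy sketch assuming the class ordering Left, Right, Up, Down; my actual preprocessing pipeline is not shown here):

```python
import numpy as np

# Assumed class ordering: Left=0, Right=1, Up=2, Down=3.
directions = ["Left", "Right", "Up", "Down"]
index = {d: i for i, d in enumerate(directions)}

def one_hot(direction: str) -> np.ndarray:
    """Return a 4-dimensional one-hot vector for the given direction."""
    vec = np.zeros(len(directions), dtype=np.float32)
    vec[index[direction]] = 1.0
    return vec

print(one_hot("Up"))  # -> [0. 0. 1. 0.]
```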
What is an acceptable range for the categorical cross-entropy loss, below which the model can be considered successfully trained?
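My understanding is that a model predicting uniform probabilities (0.25 for each of the 4 classes) scores a cross-entropy of ln(4) ≈ 1.386, so the loss should presumably be well below that. The sketch below is just my own reference computation of the loss, not code from any particular framework:

```python
import numpy as np

def categorical_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean categorical cross-entropy over a batch of one-hot targets."""
    eps = 1e-7                           # clip to avoid log(0)
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))

# Chance-level baseline: uniform predictions over the 4 classes.
y_true = np.eye(4, dtype=np.float32)                   # one example per class
y_uniform = np.full((4, 4), 0.25, dtype=np.float32)
print(categorical_cross_entropy(y_true, y_uniform))    # ln(4) ≈ 1.3863
```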