Say you want to predict temperature changes based on some input data. Temperature changes are positive or negative scalars with a mean of zero. If only the direction matters one could just use tanh as an activation function in the output layer. But say for delta-temperatures predicting the magnitude of the change is also important, not just the sign.
How would you model this output. Tanh doesn't seem to be a good choice because it gives values between -1 and 1. And say temperature changes have a gaussian, or some other weird distribution, so hovering around the center quasi-linear domain of tanh(+-0) would be difficult to learn for a neural network. I'm worried that the sign would be good but the magnitude output would be useless.
How about having the network output one-hot vectors of length N, treat the argmax of this output vector as a temperature change on a pre-defined window. Say the window is -30 - +30 degrees, using N=60 long one-hot vector, if argmax( output )=45 that means the prediction is about 15 degrees.
I was actually not sure how to search for this topic.