I am currently working with audio data and an autoencoder.
The input data ranges from -1 to 1, and the same has to be true for the output data (-1 to 1).
So, to help the network keep values between -1 and 1 throughout, I'm using Tanh() activation functions to introduce nonlinearity (the idea is to retain the "representation" of the sound through the whole network).
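To make the current setup concrete, here is a minimal sketch of the kind of autoencoder I mean (PyTorch; the layer sizes, names, and latent dimension are just placeholders, not my real architecture):

```python
import torch.nn as nn

# Sketch of the current setup: Tanh after every layer so activations
# stay in [-1, 1], matching the range of the audio samples.
class TanhAutoencoder(nn.Module):
    def __init__(self, n_samples=1024, latent=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_samples, 256), nn.Tanh(),
            nn.Linear(256, latent), nn.Tanh(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent, 256), nn.Tanh(),
            nn.Linear(256, n_samples), nn.Tanh(),  # output also in [-1, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```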
I was wondering: if I biased my data to [0, 2] and then scaled it to [0, 1], could I also use ReLU activations instead? (ReLU is the identity over [0, 1], so it wouldn't introduce any nonlinearity in that range.)
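The preprocessing I have in mind is just a shift and scale, something like this (again a sketch, with hypothetical helper names), with nn.Tanh() swapped for nn.ReLU() in the model above:

```python
import torch

def to_unit_range(x: torch.Tensor) -> torch.Tensor:
    # bias [-1, 1] up to [0, 2], then scale down to [0, 1]
    return (x + 1.0) / 2.0

def from_unit_range(y: torch.Tensor) -> torch.Tensor:
    # undo the mapping so the reconstruction is audio in [-1, 1] again
    return y * 2.0 - 1.0
```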
In general, would there be any improvement or reason to bias + normalize my data like this? Also, is ReLU 'better' than tanh, or just faster to compute?