An activation function is a non-linear transformation, usually applied in a neural network to the output of a linear or convolutional layer. Common activation functions include sigmoid, tanh, and ReLU.
Questions tagged [activation-function]
343 questions
2
votes
1 answer
How to define a modified leaky ReLU - TensorFlow
I would like to use the leaky ReLU function with minimization rather than maximization as the activation for a dense layer. In other words, I want my activation to be f(x) = min{x, αx}. I first define a method as shown below.
def…
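A minimal sketch of one way f(x) = min{x, αx} could be expressed as a Keras activation, assuming a fixed α (the 0.01 default below is an arbitrary choice, not from the question):

import tensorflow as tf

def min_leaky_relu(x, alpha=0.01):
    # Elementwise minimum of x and alpha*x: the mirror image of the
    # usual leaky ReLU, which takes the maximum instead.
    return tf.minimum(x, alpha * x)

layer = tf.keras.layers.Dense(64, activation=min_leaky_relu)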

sergey_208
- 614
- 3
- 21
2
votes
1 answer
Where is the "negative" slope in a LeakyReLU?
What does the negative slope in a LeakyReLU function refer to?
The term "negative slope" is used in the documentation of both TensorFlow and Pytorch, but it does not seem to point to reality.
The slope of a LeakyReLU function for both positive and…
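For reference, "negative slope" names the slope applied on the negative half-axis, not a slope that is itself negative; a quick sketch with Keras (the 0.2 value is arbitrary):

import tensorflow as tf

# alpha is the "negative slope": the gradient used where x < 0.
leaky = tf.keras.layers.LeakyReLU(alpha=0.2)
x = tf.constant([-2.0, -1.0, 0.0, 1.0, 2.0])
print(leaky(x).numpy())  # [-0.4 -0.2  0.  1.  2.]: slope 0.2 below zero, 1 above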

user25004
- 1,868
- 1
- 22
- 47
2
votes
1 answer
Activation function of tf.math.pow(x, 0.5) leading to NaN losses
I'm trying to use a custom square root activation function for my Keras sequential model (specifically for the MNIST dataset). When I use tf.math.sqrt(x), training goes smoothly and the model is quite accurate. However, when I try using…
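One likely culprit is that tf.math.pow(x, 0.5) returns NaN for negative inputs and has an unbounded gradient at zero; a hedged sketch of a guarded variant (the eps value is an assumption):

import tensorflow as tf

def safe_sqrt(x, eps=1e-6):
    # Clamp inputs before the square root: pow(x, 0.5) is NaN for x < 0,
    # and the gradient of sqrt blows up as x approaches 0.
    return tf.math.sqrt(tf.maximum(x, eps))

layer = tf.keras.layers.Dense(128, activation=safe_sqrt)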

ag2718
- 101
- 2
- 9
2
votes
1 answer
Using a custom step activation function in Keras results in a “'tuple' object has no attribute '_keras_shape'” error. How can I resolve this?
I'm trying to implement a custom binary activation function in the output layer of a Keras model.
This is my attempt:
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x),…
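A sketch of one way to write this so that a single tensor (not a tuple) is returned, which sidesteps the '_keras_shape' complaint; the 0.5 threshold is an assumption:

import tensorflow as tf

def binary_activation(x):
    # tf.where keeps everything as one tensor with a tracked shape.
    return tf.where(x > 0.5, tf.ones_like(x), tf.zeros_like(x))

# Caveat: a hard step has zero gradient almost everywhere, so it is
# usually reserved for inference; training typically uses sigmoid.
layer = tf.keras.layers.Dense(1, activation=binary_activation)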

Marlon Teixeira
- 334
- 1
- 14
2
votes
0 answers
Why is it discouraged to use softmax as the activation function in the last layer, according to the TensorFlow documentation?
I was following TensorFlow's Quickstart guide and noticed that it discourages using the softmax function as the activation function in the last layer. The explanation follows:
While this can make the model output more directly interpretable, this…
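A sketch of the pattern the guide favors: emit raw logits from the last layer and let the loss apply softmax internally, which is numerically more stable:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),  # logits, no softmax here
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
# Apply tf.nn.softmax to the model's outputs only when probabilities are needed.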

Bryan Conklin
- 21
- 2
2
votes
1 answer
Making a custom activation function in TensorFlow 2.0
I am trying to create a custom tanh() activation function in TensorFlow to work with the particular output range that I want. I want my network to output concentration multipliers, so I figured if the output of tanh() were negative it should return a…
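A common way to get a custom range out of tanh() is an affine rescale; a sketch, with hypothetical bounds lo and hi standing in for the desired multiplier range:

import tensorflow as tf

def scaled_tanh(x, lo=0.1, hi=10.0):
    # Map tanh's (-1, 1) output onto (lo, hi); lo and hi are placeholder
    # bounds for the concentration multipliers.
    return lo + (hi - lo) * (tf.math.tanh(x) + 1.0) / 2.0

layer = tf.keras.layers.Dense(1, activation=scaled_tanh)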

Dale Larie
- 33
- 1
- 7
2
votes
0 answers
Proper output activation function and loss function to optimize for OCR?
I am trying to make a CNN model on the IAM handwritten words data (which has images of words handwritten by multiple people, with the text in each image as the target). So I can encode words to numbers (A=0, B=1, and so on for capital letters, small letters, and punctuation).…
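For word-level OCR the usual pairing is raw logits over the character set at each timestep plus CTC loss, which handles the alignment between image frames and target text; a shape-only sketch with hypothetical sizes:

import tensorflow as tf

batch, frames, num_chars = 8, 32, 80  # 79 symbols + 1 CTC blank (hypothetical)
logits = tf.random.normal([batch, frames, num_chars])     # stand-in model output
labels = tf.random.uniform([batch, 10], 0, 79, tf.int32)  # stand-in encoded words
loss = tf.nn.ctc_loss(labels, logits,
                      label_length=tf.fill([batch], 10),
                      logit_length=tf.fill([batch], frames),
                      logits_time_major=False, blank_index=num_chars - 1)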

Naveen Reddy Marthala
- 2,622
- 4
- 35
- 67
2
votes
3 answers
Confusion about sigmoid derivative's input in backpropagation
When using the chain rule to calculate the slope of the cost function with respect to the weights at layer L, the formula becomes:
dC0/dW^(L) = … · da^(L)/dz^(L) · …
With:
z^(L) being the induced local field: z^(L) = w1^(L) · a1^(L-1) + …
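A quick numeric check of that middle term, in plain Python: the derivative da/dz takes z as its input, even though it can be expressed through the cached activation a:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 3.0])   # induced local field z(L)
a = sigmoid(z)                   # activation a(L) = sigmoid(z(L))
da_dz = a * (1.0 - a)            # sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))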

EEAH
- 715
- 4
- 17
2
votes
3 answers
What is the main purpose of ResNet if the vanishing gradient problem is solved by the ReLU activation function?
I read that ResNet solves the vanishing gradient problem by using skip connections. But isn't that problem already solved by ReLU? Is there some other important thing I'm missing about ResNet, or does the vanishing gradient problem occur even after…
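For context, a sketch of the skip connection the question asks about: ReLU removes saturation in the activation itself, but very deep plain stacks still degrade, and the identity shortcut gives gradients a direct path around each block (the filter count is arbitrary, and the input is assumed to already have that many channels):

import tensorflow as tf

def residual_block(x, filters=64):
    y = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    y = tf.keras.layers.Add()([x, y])   # identity shortcut
    return tf.keras.layers.ReLU()(y)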

krishna prasad
- 31
- 1
- 4
2
votes
2 answers
How to obtain the filter data from the convolutional layers of a CNN in DL4J to draw activation maps?
How can I get the filter data from the layer objects for a configuration and model like this?
ComputationGraphConfiguration config =
    new NeuralNetConfiguration.Builder()
        .seed(seed)
        …

Eljah
- 4,188
- 4
- 41
- 85
2
votes
1 answer
Why does the SELU activation function preserve mean 0?
From Aurélien Géron's book "Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow", p. 337:
"The authors showed that if you build a neural network composed exclusively of a stack of dense layers, and if all hidden layers use the SELU…

karu
- 465
- 3
- 12
2
votes
1 answer
Periodic activation functions
Why are periodic functions like sin(x) and cos(x) not used as activation functions in a neural network?
relu = max(0, f(x)) is used,
but
f(x) = sin(x) is not.
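sin(x) works mechanically as an activation (and SIREN-style networks do train with it); the usual objection is that its periodicity produces many equivalent local optima. A one-line sketch:

import tensorflow as tf

layer = tf.keras.layers.Dense(64, activation=tf.math.sin)  # sine activation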

Dhaval Taunk
- 1,662
- 1
- 9
- 17
2
votes
0 answers
Does a higher activation value mean a neuron is important in neural networks?
Suppose I have a deep neural network for classification, consisting of two hidden layers with 50 neurons each and ReLU as the activation function. The output of the activation function ranges from 0 to +1 in my model. Now, is…
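One hedged way to even measure this, with stand-in model and data matching the question's setup: average each hidden neuron's ReLU output over a batch, keeping in mind that a large mean alone does not establish importance, since the outgoing weights scale what each neuron contributes downstream:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(50, activation="relu"),
    tf.keras.layers.Dense(2),
])
x_batch = tf.random.normal([256, 20])               # stand-in data
probe = tf.keras.Model(model.input, model.layers[1].output)
mean_act = tf.reduce_mean(probe(x_batch), axis=0)   # per-neuron mean, shape (50,)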

Sakib Mostafa
- 43
- 1
- 9
2
votes
1 answer
Autoencoder for Tabular Data with Discrete Values
I want to use an autoencoder for dimension reduction in Keras. The input is a table with discrete values 0,1,2,3,4 (each number denotes a category) in the columns. Each subject has a label 0/1 to show sick/healthy. Now I have two…
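One hedged option for the categorical columns: one-hot encode each column and reconstruct it with a per-column softmax under categorical cross-entropy (table width and code size below are hypothetical):

import tensorflow as tf

n_cols, n_cats = 10, 5                                   # hypothetical table shape
inp = tf.keras.Input(shape=(n_cols * n_cats,))           # one-hot-flattened row
code = tf.keras.layers.Dense(4, activation="relu")(inp)  # low-dimensional code
out = tf.keras.layers.Dense(n_cols * n_cats)(code)
out = tf.keras.layers.Reshape((n_cols, n_cats))(out)
out = tf.keras.layers.Softmax(axis=-1)(out)              # distribution per column
auto = tf.keras.Model(inp, out)
auto.compile(optimizer="adam", loss="categorical_crossentropy")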

Sasha
- 21
- 1
2
votes
1 answer
Custom Keras activation function for different neurons
I have a custom Keras layer and I have to create my custom activation function. Is it possible to set fixed activations for different neurons in the same layer?
For example, let's say I have something like a Dense layer with 3 units, and I want that…
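One way this can be sketched: let the Dense layer produce raw pre-activations and apply a different fixed function to each unit inside a custom activation:

import tensorflow as tf

def mixed_activation(x):
    # Slice the 3 units apart, activate each differently, and reassemble.
    a = tf.math.tanh(x[:, 0:1])
    b = tf.nn.relu(x[:, 1:2])
    c = tf.math.sigmoid(x[:, 2:3])
    return tf.concat([a, b, c], axis=-1)

layer = tf.keras.layers.Dense(3, activation=mixed_activation)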

solopiu
- 718
- 1
- 9
- 28