
I am building an auto-encoder and I want to encode my values into a logical matrix. However, when I use my custom step activation function in one of the intermediate layers (all other layers use 'relu'), Keras raises this error:

An operation has `None` for gradient.

I've tried the hard-sigmoid function, but it doesn't fit my problem because it still produces intermediate values, and I need strictly binary outputs. I am aware that my function has no gradient at almost every point, but is it possible to use some other function for the gradient calculation while still using the step function for the accuracy and loss calculations?

My activation function:

import tensorflow as tf
import keras

def binary_activation(x):
    # Hard threshold at 0.5: output 1 where x > 0.5, and 0 elsewhere.
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)
    return keras.backend.switch(x > 0.5, ones, zeros)

I expect to be able to train the network with the binary step activation function and then use it as a typical auto-encoder. Something similar to the binary feature map used in this paper.

1 Answer


As mentioned here, you could use tf.custom_gradient to define a "back-propagatable" gradient for your activation function.

Perhaps something like:

import tensorflow as tf
import keras

@tf.custom_gradient
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)

    def grad(dy):
        return ...  # TODO define gradient

    return keras.backend.switch(x > 0.5, ones, zeros), grad
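
Where, as the comments below note, `grad` can simply return `dy` (a straight-through estimator). A minimal runnable sketch of that completion, assuming `tf.keras`; the `Dense` encoder wiring is illustrative, not part of the original answer:

import tensorflow as tf
from tensorflow import keras

@tf.custom_gradient
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)

    def grad(dy):
        # Straight-through estimator: treat the step function as the
        # identity during back-propagation and pass the gradient through.
        return dy

    return keras.backend.switch(x > 0.5, ones, zeros), grad

# Usage sketch: apply the activation through a Lambda layer so the
# forward pass stays binary while gradients flow through unchanged.
inputs = keras.layers.Input(shape=(64,))
hidden = keras.layers.Dense(32, activation='relu')(inputs)
encoded = keras.layers.Lambda(binary_activation)(hidden)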
  • Where `grad` can simply return `dy`, as a linear function. – Daniel Möller Feb 11 '19 at 15:27
  • 1
  • Depending on special cases, this might make some weight matrix increase/decrease indefinitely, eventually hitting a machine limit. In these cases, perhaps a "normal" shaped gradient (similar to the gradient of 'sigmoid') would be an option (see the sketch after these comments). – Daniel Möller Feb 11 '19 at 15:29
  • This definition gives me exactly what I'm looking for. However, I'm getting an error: `AttributeError: 'tuple' object has no attribute '_keras_shape'`. I'm using `grad` as suggested by @DanielMöller. – Marlon Teixeira Feb 12 '21 at 15:47
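
Building on Daniel Möller's second comment, a sigmoid-shaped surrogate gradient keeps the weight updates bounded near the threshold instead of growing indefinitely. A sketch, again assuming `tf.keras`; centring the surrogate at the 0.5 threshold is an assumption for illustration:

import tensorflow as tf
from tensorflow import keras

@tf.custom_gradient
def binary_activation(x):
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)

    def grad(dy):
        # Sigmoid-shaped surrogate: scale the incoming gradient by the
        # derivative of a sigmoid centred at 0.5, so gradients fade far
        # from the decision boundary rather than passing through unchanged.
        s = tf.sigmoid(x - 0.5)
        return dy * s * (1 - s)

    return keras.backend.switch(x > 0.5, ones, zeros), grad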