I am building an auto-encoder and I want to encode my values into a logical matrix. However, when I use my custom step activation function in one of the intermediate layers (all other layers use 'relu'), Keras raises this error:
An operation has `None` for gradient.
I've tried using the hard-sigmoid function, but it doesn't fit my problem because it still produces intermediate values, while I only need binary ones. I am aware that my function has zero gradient almost everywhere (and is not differentiable at the threshold), but is it possible to use some other function for the gradient calculation and still use the step function for the accuracy and loss calculations?
My activation function:
import tensorflow as tf
from tensorflow import keras

def binary_activation(x):
    # Output 1 where the input exceeds 0.5, otherwise 0.
    ones = tf.ones(tf.shape(x), dtype=x.dtype.base_dtype)
    zeros = tf.zeros(tf.shape(x), dtype=x.dtype.base_dtype)
    return keras.backend.switch(x > 0.5, ones, zeros)
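For context, this is roughly how the function is wired into the model; the input shape and layer sizes below are placeholders rather than my real architecture:

from tensorflow import keras
from tensorflow.keras import layers

# Placeholder architecture: only the bottleneck uses the step activation,
# binary_activation is the function defined above.
inputs = keras.Input(shape=(100,))
x = layers.Dense(64, activation='relu')(inputs)
code = layers.Dense(32, activation=binary_activation)(x)
x = layers.Dense(64, activation='relu')(code)
outputs = layers.Dense(100, activation='relu')(x)
autoencoder = keras.Model(inputs, outputs)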
I expect to be able to use the binary step activation function to train the network and then use it as a typical auto-encoder, something similar to the binary feature map used in this paper.
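What I have in mind is something like the straight-through trick sketched below: the forward pass stays binary, but the gradient is borrowed from a smooth surrogate. The sigmoid surrogate is just my guess, and I don't know whether this is the proper way to do it in Keras:

import tensorflow as tf

def binary_activation_ste(x):
    # Forward pass: hard 0/1 step at the 0.5 threshold.
    forward = tf.cast(x > 0.5, x.dtype)
    # Backward pass: gradient of a smooth surrogate (sigmoid is just a guess).
    surrogate = tf.sigmoid(x)
    # This expression evaluates to `forward`, but gradients flow only
    # through `surrogate` because the difference is wrapped in stop_gradient.
    return surrogate + tf.stop_gradient(forward - surrogate)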