
I seem to have a rather specific case, because I couldn't find what I'm looking for on the web, so here's my problem:

I have coded an NN that takes an array of a specific length and should give me a single value as output:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers

    model = tf.keras.Sequential()
    model.add(layers.Embedding(input_dim=int(input_len_array), output_dim=8 * int(input_len_array)))

    model.add(layers.GRU(32 * int(input_len_array), return_sequences=True))

    # Last layer...
    model.add(layers.Dense(1, activation='tanh'))
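
For reference, a quick shape check on a dummy batch (assuming input_len_array was 40 when the model was built, since the real value isn't shown) suggests this stack returns one value per timestep rather than one value per sample, because of return_sequences=True:

    # Hypothetical smoke test; input_len_array = 40 is only a placeholder.
    dummy_batch = np.random.randint(0, 40, size=(2, 40))
    print(model.predict(dummy_batch).shape)  # -> (2, 40, 1): one value per timestep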

After that I create my custom_loss function:

    def custom_loss(x_, y_):
        sess = tf.Session()
        Sortino = self.__backtest(x_, y_)

        def loss(y_true, y_pred):
            print('Sortino: ', Sortino)

            # The Optimizer will MAXIMIZE the Sortino so we compute -Sortino
            return tf.convert_to_tensor(-Sortino)

        return loss
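
For reference, a loss that Keras can differentiate has to stay in tensor operations on y_true and y_pred. A minimal sketch of a Sortino-like ratio written that way (just an illustration, not the __backtest logic below) could look like:

    def sortino_like_loss(y_true, y_pred):
        # Per-sample "returns": prediction scaled by the realized return.
        returns = y_pred * y_true
        mean = tf.reduce_mean(returns)
        std = tf.sqrt(tf.reduce_mean(tf.square(returns - mean))) + 1e-8
        # Minimizing the negative ratio maximizes the ratio itself.
        return -mean / std

    # self.model.compile(optimizer='adam', loss=sortino_like_loss)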

After that I compile my model and give it the whole batch of values in the tensors x and y:

    self.model.compile(optimizer='adam', loss=custom_loss(x, y))

Inside the custom loss I call the function self.__backtest, which is defined below:

    def __backtest(self, x_: tf.Tensor, y_r: tf.Tensor, timesteps=40):
        my_list = []
        capital_evolution = [1.0]  # start from a unit capital
        sess = tf.Session()

        # Defining the Encoder
        # enc = OneHotEncoder(handle_unknown='ignore')
        # X = [[-1, 0], [0, 1], [1, 2]]
        # enc.fit(X)

        # sess.run(x_)[i, :] is <class 'numpy.ndarray'>
        print('in backtest: int(x_.get_shape())', x_.get_shape())

        for i in range(int(x_.get_shape()[0])):
            scaled_output = self.model.predict(sess.run(x_)[i, :] / np.linalg.norm(sess.run(x_)[i, :]))

            # categorical_output = tf.keras.utils.to_categorical(scaled_output)

            my_list.append(scaled_output * sess.run(y_r)[i])
            if i < 10:
                print('First 10 scaled outputs: ', scaled_output)
            if i > 0:
                capital_evolution.append(capital_evolution[-1] * (my_list[-1] + 1))

        my_array = np.array(my_list)

        if len(my_array) < 10:
            return -10

        try:
            Sortino = my_array.mean() / my_array.std()
        except Exception:
            Sortino = -10

        return Sortino
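
For reference, the per-sample loop above could in principle be replaced by batch-level tensor algebra so that gradients can flow. A hedged sketch, assuming the model returns one value per sample of shape (batch, 1) and y_r holds the per-sample returns, might look like:

    def backtest_vectorized(model, x_, y_r):
        # One forward pass over the whole batch; calling the model directly
        # (instead of model.predict) keeps the result as a differentiable tensor.
        preds = tf.squeeze(model(x_), axis=-1)        # shape (batch,)
        returns = preds * y_r                         # per-sample scaled returns
        capital = tf.math.cumprod(1.0 + returns)      # mirrors capital_evolution (unused in the ratio)
        mean = tf.reduce_mean(returns)
        std = tf.sqrt(tf.reduce_mean(tf.square(returns - mean))) + 1e-8
        return mean / std                             # Sortino-like ratio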

The code isn't able to run and gives me this error:

    ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

I would be more than grateful if someone could give me the solution!! MANY THANKS!!

  • Hey, I have the same error at the moment. It's normally about using methods that are not differentiable. So for example, if you build a custom layer that has a for loop in it, it isn't differentiable, so you can't pass a gradient to the layer before it. That's what the error is mainly about. I don't think it has something to do with your loss function, because the loss function doesn't calculate gradients, but merely a difference between input and output. Did this help you somehow? – Paul Higazi Apr 16 '20 at 21:16
  • Hi Paul, that helped me somehow, because I didn't know a for loop could be a problem, but now that you mention it, it seems logical: it's clearly not differentiable. Now I "just" have to find an alternative to the loop ^^ ... I was thinking... in numerical analysis we can approximate the gradient of a function if we can compute the function CONTINUOUSLY (say at X and at X+dx)... maybe that's what you are talking about. It may be a good clue to try to define the gradient of the for loop somehow... Or maybe there's an optimizer that doesn't require gradients? Maybe optimize genetically? – Bogdan Secas Apr 17 '20 at 12:53
  • Yeah, genetic algorithms don't need gradients; actually, it's up to you how you build a genetic algorithm. For example, you can use groups of CNNs, train them in parallel with stochastic gradient descent, and after 10 epochs use a genetic algorithm to determine the fitness, but that's quite hard to compute and takes a lot of time. Actually, you should think about vectorization, because you can compute a lot of loops as simple vector algebra, just like a CUDA kernel (if you've heard of it somehow). TensorFlow can help you there a lot. Try to use the Lambda layer of Keras to wrap the algebra (see the sketch after these comments). – Paul Higazi Apr 17 '20 at 13:03
  • See this post on Stack Overflow: https://stackoverflow.com/questions/60198888/keras-custom-layer-valueerror-an-operation-has-none-for-gradient, actually just the first answer ^^ – Paul Higazi Apr 17 '20 at 13:05
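
Regarding the Lambda-layer suggestion above, a hypothetical sketch of wrapping element-wise algebra inside the graph (the input shape of 40 is just a placeholder) could look like:

    # Element-wise algebra expressed as tensor ops stays differentiable,
    # unlike a Python for-loop over sess.run() results.
    inputs = layers.Input(shape=(40,))
    normalized = layers.Lambda(
        lambda t: t / (tf.norm(t, axis=-1, keepdims=True) + 1e-8))(inputs)
    outputs = layers.Dense(1, activation='tanh')(normalized)
    lambda_model = tf.keras.Model(inputs, outputs)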
