
I have a class with a model inside:

class MyNetwork:

    def __init__(self):
        # layer initializations
        self.optimizer = tf.train.RMSPropOptimizer(0.001)
        self.loss = tf.losses.sigmoid_cross_entropy

    def build(self):
        # layer connections

        self.model = keras.Model(inputs=[inputs], outputs=[outputs])
        return self.model

    @tf.function
    def train_step(self, images, labels):
        with tf.GradientTape() as tape:
            predictions = self.model(images)
            loss = self.loss(labels, predictions)

        gradients = tape.gradient(loss, self.model.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.model.trainable_variables))
        return loss, predictions

I use the following to build the model:

network = MyNetwork()
model = network.build()

When training with the following lines:

model.compile(tf.train.RMSPropOptimizer(0.001), loss=tf.losses.sigmoid_cross_entropy, metrics=['accuracy'])
model.fit(X, y, epochs=10)

The model trains without any issues.

But in a separate run, using the following code:

for i in range(10):
    print("Epoch ", (i))

    loss, pred = network.train_step(X, y)
    print(loss)

The loss gets stuck after a few epochs, and the calculated accuracy also plateaus at 0.5.

Does anyone know how a Keras model can be trained using GradientTape?
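For reference, the pattern I am trying to follow is roughly the sketch below. It uses the tf.keras APIs (tf.keras.optimizers.RMSprop and tf.keras.losses.BinaryCrossentropy) with a tiny stand-in model and random data purely for illustration; I am not sure whether switching to these APIs is what my code above is missing:

import tensorflow as tf

# Sketch only: a small stand-in model; in my case the model comes from MyNetwork.build()
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1)  # raw logits
])

optimizer = tf.keras.optimizers.RMSprop(0.001)
loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss

# Dummy data, for illustration only
X = tf.random.normal((32, 8))
y = tf.cast(tf.random.uniform((32, 1)) > 0.5, tf.float32)

for epoch in range(10):
    loss = train_step(X, y)
    print("Epoch", epoch, "loss:", float(loss))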

Susmit Agrawal
  • Aren't the `optimizer` and `loss` you used in `train_step` different from those passed to `model.compile()`? Did you try, for example, `model.compile(network.optimizer, network.loss, metrics=['accuracy'])`? – Yahya Dec 23 '19 at 11:34
