Training a basic TensorFlow Model using the GradientTape

Question

Purely for educational purposes, I was trying to build on the "Basic training loops" tutorial from the TensorFlow homepage to create a simple neural network that classifies points in the plane.

I have some points in [0,1]x[0,1] stored in a tensor x of shape (250, 2, 1) and the corresponding labels (1. or 0.) stored in a tensor y of shape (250, 1, 1). Then I run:

import tensorflow as tf

# Weights and biases for a 2 -> 4 -> 1 network
w0 = tf.Variable(tf.random.normal([4, 2]), name='w0')
w1 = tf.Variable(tf.random.normal([1, 4]), name='w1')
b1 = tf.Variable(tf.zeros([4, 1]), name='b1')
b2 = tf.Variable(tf.zeros([1, 1]), name='b2')

loss = tf.keras.losses.CategoricalCrossentropy()

def forward(x):
    x0 = x
    z1 = tf.matmul(w0, x0) + b1  # shape (250, 4, 1)
    x1 = tf.nn.relu(z1)
    z2 = tf.matmul(w1, x1) + b2  # shape (250, 1, 1)
    x2 = tf.nn.sigmoid(z2)
    return x2

with tf.GradientTape() as t:
    current_loss = loss(y, forward(x))

gradients = t.gradient(current_loss, [b1, b2, w0, w1])

What I get is a list of tensors of the expected shapes, but they contain only zeros. Does anyone have any advice?

Better zero than `None`! What did you expect? – Innat Apr 01 '21 at 11:10

score 0 · Accepted Answer · answered Apr 01 '21 at 21:49

The problem is that the labels and predictions do not have the shapes the loss function expects. tf.keras.losses.CategoricalCrossentropy expects labels in a one-hot representation, with one entry per class along the last axis, but your labels and predictions have shape (250, 1, 1), and the behaviour of the loss function is unclear in this situation. Since your network has a single sigmoid output and the labels are 0. or 1., using tf.keras.losses.BinaryCrossentropy instead should solve the problem.
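
To see the difference concretely, here is a minimal sketch that computes the gradients once with each loss. The data is a hypothetical stand-in (random points, labelled by an arbitrary rule) with the shapes from the question, and the helper name gradients_for is chosen only for this illustration. As far as I can tell, when the last axis has size 1, CategoricalCrossentropy rescales each prediction by the sum over that axis, so every prediction becomes 1 and the loss no longer depends on the weights, which would explain the zeros.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in data with the shapes described in the question.
x = tf.random.uniform([250, 2, 1])
# Arbitrary labelling rule for illustration: 1.0 if the coordinates sum to more than 1.
y = tf.cast(tf.reduce_sum(x, axis=1, keepdims=True) > 1.0, tf.float32)  # (250, 1, 1)

w0 = tf.Variable(tf.random.normal([4, 2]), name='w0')
w1 = tf.Variable(tf.random.normal([1, 4]), name='w1')
b1 = tf.Variable(tf.zeros([4, 1]), name='b1')
b2 = tf.Variable(tf.zeros([1, 1]), name='b2')

def forward(x):
    x1 = tf.nn.relu(tf.matmul(w0, x) + b1)      # (250, 4, 1)
    return tf.nn.sigmoid(tf.matmul(w1, x1) + b2)  # (250, 1, 1)

def gradients_for(loss_fn):
    """Run one forward pass under a tape and return the gradients."""
    with tf.GradientTape() as tape:
        current_loss = loss_fn(y, forward(x))
    return tape.gradient(current_loss, [b1, b2, w0, w1])

cce_grads = gradients_for(tf.keras.losses.CategoricalCrossentropy())
bce_grads = gradients_for(tf.keras.losses.BinaryCrossentropy())

# Largest absolute entry across each gradient list.
cce_all_zero = all(float(tf.reduce_max(tf.abs(g))) == 0.0 for g in cce_grads)
bce_has_signal = any(float(tf.reduce_max(tf.abs(g))) > 0.0 for g in bce_grads)

print("CategoricalCrossentropy gradients all zero:", cce_all_zero)
print("BinaryCrossentropy gradients non-zero:", bce_has_signal)
```

If my reading is right, the first flag reproduces the all-zero gradients from the question, while the second shows that BinaryCrossentropy gives the optimizer something to work with.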

rvinas

Thank you so much. One-Hot encoded my labels and everything works out beautifully! – Tha_X Apr 03 '21 at 14:42
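
For anyone taking the one-hot route instead, a sketch of how that might look follows. The comment does not say how the network was changed, so the two-unit softmax output (w1 of shape [2, 4], b2 of shape [2, 1]) is an assumption made for this illustration, as is the stand-in data.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in data with the shapes described in the question.
x = tf.random.uniform([250, 2, 1])
y = tf.cast(tf.reduce_sum(x, axis=1, keepdims=True) > 1.0, tf.float32)  # (250, 1, 1)

# One-hot encode the labels: (250, 1, 1) -> (250,) -> (250, 2)
y_onehot = tf.one_hot(tf.cast(tf.reshape(y, [-1]), tf.int32), depth=2)

w0 = tf.Variable(tf.random.normal([4, 2]), name='w0')
b1 = tf.Variable(tf.zeros([4, 1]), name='b1')
# Assumption: two output units, one per class, to match the one-hot labels.
w1 = tf.Variable(tf.random.normal([2, 4]), name='w1')
b2 = tf.Variable(tf.zeros([2, 1]), name='b2')

def forward(x):
    x1 = tf.nn.relu(tf.matmul(w0, x) + b1)  # (250, 4, 1)
    z2 = tf.matmul(w1, x1) + b2             # (250, 2, 1)
    # Drop the trailing axis so the class axis is last, then normalise.
    return tf.nn.softmax(tf.squeeze(z2, axis=-1), axis=-1)  # (250, 2)

loss = tf.keras.losses.CategoricalCrossentropy()

with tf.GradientTape() as t:
    current_loss = loss(y_onehot, forward(x))

gradients = t.gradient(current_loss, [b1, b2, w0, w1])
has_signal = any(float(tf.reduce_max(tf.abs(g))) > 0.0 for g in gradients)
print("Gradients non-zero:", has_signal)
```

Here the class axis has size 2, which is the layout CategoricalCrossentropy is designed for.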
