I am trying to use a custom loss function (critical success index) with my simple CNN (for 64x64 pixel images) in TensorFlow, but I am getting a list of Nones for the gradients.
Here is the custom loss function:
import tensorflow as tf
from keras import backend as K

@tf.function
def custom_csi_loss(y_true, y_pred):
    # Define the target class
    target_class = 1
    # Calculate the true positives, false positives, and false negatives
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    false_positives = K.sum(K.round(K.clip(y_pred - y_true, 0, 1)))
    false_negatives = K.sum(K.round(K.clip(y_true - y_pred, 0, 1)))
    # Calculate the CSI
    csi = true_positives / (true_positives + false_negatives + false_positives)
    # Return the negative of the CSI as the loss (since we want to minimize the loss)
    return -csi
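To rule out the rest of the pipeline, here is a minimal standalone check (dummy tensors with made-up values, using custom_csi_loss and the imports from above) that reproduces the missing gradient I am seeing:

# Dummy ground truth and a trainable prediction tensor (values are arbitrary)
y_true = tf.constant([[0.0], [1.0], [1.0], [0.0]])
y_pred = tf.Variable([[0.2], [0.7], [0.4], [0.1]])

with tf.GradientTape() as tape:
    loss = custom_csi_loss(y_true, y_pred)

# In my runs this prints None, i.e. no gradient reaches y_pred through the loss
print(tape.gradient(loss, y_pred))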
and here is the model:
from keras.layers import Input, BatchNormalization, Conv2D, Reshape
from keras.models import Model

def build_scnn(shape=(128, 128, 3), k_init="he_normal", dilation_rate=(1, 1), dtype=tf.float32):
    inputs = Input(shape=shape)
    normalized = BatchNormalization(axis=3)(inputs)
    x = Conv2D(64, 3, padding="same", activation="relu", kernel_initializer=k_init)(normalized)
    x = Conv2D(128, 3, padding="same", activation="relu", dilation_rate=dilation_rate, kernel_initializer=k_init)(x)
    x = Conv2D(128, 3, padding="same", activation="relu", kernel_initializer=k_init)(x)
    outputs = Conv2D(1, 1, padding="same", activation="sigmoid", dtype=dtype)(x)
    outputs = Reshape((64 * 64, 1))(outputs)
    scnn = Model(inputs, outputs, name="SCNN")
    return scnn
scnn = build_scnn(shape=(64, 64, len(gdf[features].columns)),
                  k_init=k_init,
                  dilation_rate=dilation_rate)
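For completeness, loss_fn, optimizer, and the metric used below are wired up roughly like this; the optimizer, learning rate, and metric choice shown here are placeholders rather than my exact values (CSI for the positive class is the same quantity as the binary IoU of class 1):

loss_fn = custom_csi_loss                                                     # the custom loss under test
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)                      # placeholder optimizer / learning rate
metrics = [tf.keras.metrics.BinaryIoU(target_class_ids=[1], threshold=0.5)]   # placeholder CSI-style metric, used only for logging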
This is the training step:
@tf.function
def train_step(x, y):
    with tf.GradientTape(watch_accessed_variables=True) as tape:
        tape.watch(scnn.trainable_variables)
        y_pred = scnn(x, training=True)
        loss = loss_fn(y, y_pred)
    gradients = tape.gradient(loss, scnn.trainable_variables)  # differentiate loss wrt scnn weights
    print(f"gradients: {gradients}")
    optimizer.apply_gradients(zip(gradients, scnn.trainable_variables))
    return loss, y_pred
and here is the main body of the code:
for epoch in range(epochs):
    epoch_loss = 0
    epoch_csi = 0
    num_batches = 0
    for x, y, w in train.map(weight_func):
        y = tf.cast(y, dtype=tf.float32)
        loss, y_pred = train_step(x, y)
        epoch_loss += loss
        epoch_csi += metrics[0](y, y_pred)
        num_batches += 1
    epoch_loss /= num_batches
    epoch_csi /= num_batches
The gradients variable is always [None, None, None, ...], and the rest of the code fails. The code works with keras.losses.BinaryCrossentropy and other built-in binary loss functions, so as far as I can tell the issue must be with the custom_csi_loss function. I have checked the shapes and data types of y and y_pred and they are consistent.
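That check was along these lines (a temporary debug print inside train_step after y_pred is computed; the exact print statement is approximate):

# Temporary debug print inside train_step
tf.print("y:", tf.shape(y), y.dtype, "y_pred:", tf.shape(y_pred), y_pred.dtype)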
This question resembles the following, but their answers didn't solve my problem:
- Tensorflow 2.0 doesn't compute the gradient
- How do I properly use GradientTape to make a custom loss function in TensorFlow?
(Please help!)