
I am training a binary detection architecture using TensorFlow 2.2 and Keras. Previously, this worked when I loaded the data in the same script that trains the model. However, with a larger dataset (6x more samples, same ratio of positive to negative samples), I now get the set of errors below. Across multiple runs, training ran for 5-10 epochs before failing:

tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
     [[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
     [[gradient_tape/point_conv_fp_1/ScatterNd/_192]]
  (1) Invalid argument:  assertion failed: [predictions must be >= 0] [Condition x >= y did not hold element-wise:] [x (dense_1/Sigmoid:0) = ] [[[nan][nan][nan]]...] [y (Cast_4/x:0) = ] [0]
     [[{{node assert_greater_equal/Assert/AssertGuard/else/_1/Assert}}]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_14820]

Here is the architecture:

[image: architecture diagram]

And here is code related to the layer where the error appears:

# initialisation
..
# point_conv_sa layers
..
self.dense4 = keras.layers.Dense(128, activation=tf.nn.elu)
self.bn4 = keras.layers.BatchNormalization()
self.dropout4 = keras.layers.Dropout(0.5)

# This line corresponds to 'dense_1' in the image
self.dense_fin = keras.layers.Dense(self.num_classes, activation=tf.nn.sigmoid, bias_initializer=self.initial_bias)

# training step
..
# point_conv_fp layers
..
net = self.dense4(points)
net = self.bn4(net)
net = self.dropout4(net)

pred = self.dense_fin(net)

return pred

Could it have to do with the loss function I'm using? I used keras.losses.BinaryCrossentropy() and there was no problem with either the small or the large dataset. Then I changed to a focal loss based on https://github.com/mkocabas/focal-loss-keras, and it failed on the large dataset:

import tensorflow as tf
from tensorflow.keras import backend as K

def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        # pt_1: predicted probability where the label is 1 (1.0 elsewhere)
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        # pt_0: predicted probability where the label is 0 (0.0 elsewhere)
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) \
               - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
    return focal_loss_fixed

....
model.compile(
    optimizer=keras.optimizers.Adam(config['lr']),
    loss=focal_loss(alpha=config['fl_alpha'], gamma=config['fl_gamma']),
    metrics=[Precision(),
             Recall(),
             AUC()]
)
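
For reference, I believe the "predictions must be >= 0" assertion in the traceback comes from these metrics: Precision, Recall, and AUC check that predictions lie in [0, 1], and NaN fails that check. A minimal standalone sketch (my own reproduction in eager mode, not my actual training code) that triggers the same error:

import tensorflow as tf

# NaN fails the metric's internal "predictions >= 0" assertion and raises
# the same InvalidArgumentError as in the traceback above.
m = tf.keras.metrics.AUC()
m.update_state([0.0, 1.0], [float('nan'), 0.5])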

Let me know if more information is needed.

Cheers

  • I think I know why (haven't tried it, but at least in theory it makes sense). If pt_1 = 0 or pt_0 = 1, then K.log(pt_1) or K.log(1. - pt_0) evaluates to -inf and the loss/gradients become NaN. The solution would be to clamp the log arguments, e.g. pt_1 = tf.math.maximum(pt_1, K.epsilon()), and K.log(1. - pt_0) should become K.log(tf.math.maximum(1. - pt_0, K.epsilon())) (see the sketch below). – user14397621 Oct 06 '20 at 04:30
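
A sketch of that suggestion applied to the loss from the question (untested; the two clamping lines are the only change from the original repo's code):

import tensorflow as tf
from tensorflow.keras import backend as K

def focal_loss(gamma=2., alpha=.25):
    def focal_loss_fixed(y_true, y_pred):
        pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
        pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
        # Clamp the log arguments away from 0 so K.log can never return -inf
        pt_1 = tf.math.maximum(pt_1, K.epsilon())
        one_minus_pt_0 = tf.math.maximum(1. - pt_0, K.epsilon())
        return -K.mean(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) \
               - K.mean((1 - alpha) * K.pow(pt_0, gamma) * K.log(one_minus_pt_0))
    return focal_loss_fixed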

1 Answer


Updating to TensorFlow 2.10 should fix it: https://github.com/keras-team/keras/issues/15715#issuecomment-1100795008
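
Once on 2.10+, you could also swap the custom loss for the built-in focal loss (a sketch, not from the linked issue; assumes TF 2.10+, where apply_class_balancing and alpha are available, with alpha/gamma values mirroring the custom loss's defaults):

import tensorflow as tf

# Built-in, numerically stable binary focal loss (TF 2.10+);
# alpha and gamma play the same roles as in the custom focal_loss above.
loss = tf.keras.losses.BinaryFocalCrossentropy(
    apply_class_balancing=True, alpha=0.25, gamma=2.0)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-3),
    loss=loss,
    metrics=[tf.keras.metrics.Precision(),
             tf.keras.metrics.Recall(),
             tf.keras.metrics.AUC()],
)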

  • While this link might answer the question, if possible you should [edit] your answer to include the most important details that are relevant to the question in your answer. This can help prevent your answer becoming invalid if the link stops working, or the content significantly changes. – Hoppeduppeanut Mar 01 '23 at 04:27