I'm trying to use a custom square root activation function for my Keras sequential model (specifically for the MNIST dataset). When I use tf.math.sqrt(x), training goes smoothly and the model is quite accurate. However, when I try using tf.math.pow(x, 0.5), the model fails to train and the losses go to NaN.

I'm unsure why this is happening, because I would expect the two alternatives to be mathematically identical.

Square root function

def tfsqrt(x):
    # Signed square root: sqrt(x) for x >= 0, -sqrt(-x) for x < 0
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.sqrt(x), -tf.math.sqrt(-x))

Power function

def pwsqrt(x):
    # Same signed square root, but written with pow(., 0.5) instead of sqrt
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.pow(x, 0.5), -tf.math.pow(-x, 0.5))
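For context, here is a minimal sketch of how such a custom activation can be wired into a Keras sequential model for MNIST. The layer sizes, optimizer, and loss below are illustrative assumptions, not details taken from the question:

```python
import tensorflow as tf

def tfsqrt(x):
    # Signed square root activation from the question
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.sqrt(x), -tf.math.sqrt(-x))

# Hypothetical MNIST model: a Python callable can be passed
# directly as the `activation` argument of a Dense layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tfsqrt),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Swapping `tfsqrt` for `pwsqrt` in the `Dense` layer is the only change between the two setups being compared.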

If anybody could explain this unexpected behavior, that would be much appreciated. Thanks!

ag2718

1 Answer


The functions themselves are correct. Testing both on sample values:

x = tf.Variable([-2.0, -3.0, 0.0, 1.0, 2.0])

y = tfsqrt(x)
y
y = pwsqrt(x)
y
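That check can be made runnable end to end. A self-contained sketch, redefining both functions so the snippet stands alone:

```python
import tensorflow as tf

def tfsqrt(x):
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.sqrt(x), -tf.math.sqrt(-x))

def pwsqrt(x):
    cond = tf.greater_equal(x, 0)
    return tf.where(cond, tf.math.pow(x, 0.5), -tf.math.pow(-x, 0.5))

x = tf.constant([-2.0, -3.0, 0.0, 1.0, 2.0])
print(tfsqrt(x).numpy())
print(pwsqrt(x).numpy())
# The forward outputs agree (up to floating-point rounding),
# so the difference the asker sees is not in the forward pass.
```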

Both functions work fine in Google Colab, so perhaps there are some NaN values in your data.

Or there may be a problem in the model's loss or metric.
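One quick way to rule out bad inputs is to scan the training array for NaN/Inf before training. A sketch, where `x_train` is stand-in data (not the asker's actual array):

```python
import numpy as np
import tensorflow as tf

# Stand-in for the real training data
x_train = np.array([[0.1, 0.5], [0.9, 0.0]], dtype=np.float32)

# NumPy-level check for NaN values
print(np.isnan(x_train).any())

# tf.debugging.check_numerics raises InvalidArgumentError
# if the tensor contains any NaN or Inf values
tf.debugging.check_numerics(tf.constant(x_train), "x_train check")
```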

Peter Pirog