
I am trying to build a neural network for linear regression. I want to add a regularization term to the cost function, but the cost does not change after each iteration. The code is as follows:

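# Placeholders hold one example per column: X is (n_x, m), Y is (n_y, m)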
X = tf.placeholder(tf.float32, [n_x, None], name="x")
Y = tf.placeholder(tf.float32, [n_y, None], name="y")
W1 = tf.get_variable("W1", [25, 11], initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.get_variable("b1", [25, 1], initializer=tf.zeros_initializer())
W2 = tf.get_variable("W2", [25, 25], initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.get_variable("b2", [25, 1], initializer=tf.zeros_initializer())
W3 = tf.get_variable("W3", [25, 25], initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.get_variable("b3", [25, 1], initializer=tf.zeros_initializer())
W4 = tf.get_variable("W4", [25, 25], initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.get_variable("b4", [25, 1], initializer=tf.zeros_initializer())
W5 = tf.get_variable("W5", [12, 25], initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.get_variable("b5", [12, 1], initializer=tf.zeros_initializer())
W6 = tf.get_variable("W6", [1, 12], initializer=tf.contrib.layers.xavier_initializer())
b6 = tf.get_variable("b6", [1, 1], initializer=tf.zeros_initializer())
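# Forward propagation: input (n_x = 11 features) -> 25 -> 25 -> 25 -> 25 -> 12 -> 1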
Z1 = tf.add(tf.matmul(W1, X), b1)
A1 = tf.nn.relu(Z1)
Z2 = tf.add(tf.matmul(W2, A1), b2)
A2 = tf.nn.relu(Z2)
Z3 = tf.add(tf.matmul(W3, A2), b3)
A3 = tf.nn.relu(Z3)
Z4 = tf.add(tf.matmul(W4, A3), b4)
A4 = tf.nn.relu(Z4)
Z5 = tf.add(tf.matmul(W5, A4), b5)
A5 = tf.nn.relu(Z5)
Z6 = tf.add(tf.matmul(W6, A5), b6)
A6 = tf.nn.tanh(Z6)  # output activation; tanh bounds predictions to (-1, 1)
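# Note: tf.nn.l2_loss(t) computes sum(t ** 2) / 2, so the term below
# adds (beta / 2) * (sum of all squared weights) to the cost.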
regularizers = tf.nn.l2_loss(W1) + tf.nn.l2_loss(W2) + tf.nn.l2_loss(W3) + tf.nn.l2_loss(W4) + tf.nn.l2_loss(W5) + tf.nn.l2_loss(W6)
beta = 0.01
cost = (1 / (2 * m)) * tf.reduce_sum(tf.pow(A6 - Y, 2)) + beta * regularizers
optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)
num_epochs = 2000
init = tf.global_variables_initializer()


with tf.Session() as sess:

    sess.run(init)

    for epoch in range(num_epochs):

        epoch_cost = 0.
        num_minibatches = int(m / minibatch_size)
        minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)

        for minibatch in minibatches:
            (minibatch_X, minibatch_Y) = minibatch
            _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
            epoch_cost += minibatch_cost / num_minibatches

        if epoch % 100 == 0:
            print("Cost after epoch %i: %f" % (epoch, epoch_cost))
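One way to check whether the regularization term is actually contributing is to evaluate it separately from the data term. This is a minimal sketch, assuming the graph above and one minibatch (`minibatch_X`, `minibatch_Y`) are still in scope inside the session:

reg_term = beta * regularizers  # depends only on the weights, so no feed_dict is needed
reg_value = sess.run(reg_term)
total_value = sess.run(cost, feed_dict={X: minibatch_X, Y: minibatch_Y})
print("regularization: %f, data term: %f" % (reg_value, total_value - reg_value))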

After running the script, the output is:

Cost after epoch 0: 4.524896

Cost after epoch 100: 0.000041
Cost after epoch 200: 0.000041
Cost after epoch 300: 0.000041
Cost after epoch 400: 0.000041
Cost after epoch 500: 0.000041
Cost after epoch 600: 0.000041
Cost after epoch 700: 0.000041
Cost after epoch 800: 0.000041
Cost after epoch 900: 0.000041
Cost after epoch 1000: 0.000041
Cost after epoch 1100: 0.000041
Cost after epoch 1200: 0.000041
Cost after epoch 1300: 0.000041
Cost after epoch 1400: 0.000041
Cost after epoch 1500: 0.000041
Cost after epoch 1600: 0.000041
Cost after epoch 1700: 0.000041
Cost after epoch 1800: 0.000041
Cost after epoch 1900: 0.000041

As you can see, after initializing the parameters and running the session, the cost does not change past epoch 100. I was wondering if I could get some help, and whether the cost function is correct.

Amr Gaballah
  • Could you show the code you use to run the optimizer? – pfm Feb 18 '18 at 07:09
  • @Nicolas I added this code to the original code – Amr Gaballah Feb 18 '18 at 20:09
  • What is the reason you expect the loss to continue changing? If you don't have a lot of examples, your network might have reached its local minimum (0.000041 seems pretty low) by 100 epochs. Your regularization seems fine at first glance. – iga Feb 20 '18 at 06:08
  • @iga Do you think that the cost function up there is correct? – Amr Gaballah Feb 20 '18 at 18:01
  • It looks fine (though I would not use `tf.pow` to compute a square - just *). FYI, there are various common losses available in TF, e.g. https://www.tensorflow.org/api_docs/python/tf/losses/mean_squared_error – iga Feb 20 '18 at 19:12
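For reference, a version of the cost using the built-in loss iga links to might look like the sketch below. Note that `tf.losses.mean_squared_error` averages over all elements by default, so its scale differs from the `(1 / (2 * m)) * tf.reduce_sum(...)` formulation above:

mse = tf.losses.mean_squared_error(labels=Y, predictions=A6)
cost = mse + beta * regularizers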
