
I'm relatively new to TensorFlow, and I was trying to play around with the MNIST dataset.

This is the code I have, but for some reason the epoch cost increases with each iteration. I tried changing the learning rate, the number of layers, and the number of neurons, but the trend is consistently upward.

It would be great if someone could help me out.

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('/tmp/data/',one_hot = True)

def NN(x):
    layer1 = 10
    layer2 = 10
    inps = 28*28
    outs = 10

    w1 = tf.Variable(np.random.randn(layer1, inps))
    w2 = tf.Variable(np.random.randn(layer2, layer1))
    w3 = tf.Variable(np.random.randn(outs, layer2))

    l1 = tf.matmul(w1,x)
    l1 = tf.nn.relu(l1)

    l2 = tf.matmul(w2,l1)
    l2 = tf.nn.relu(l2)

    l3 = tf.matmul(w3, l2)

    return l3


x = tf.placeholder(tf.float64, [28*28, None])
y = tf.placeholder(tf.int64, [10, None])
predic = NN(x)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits = predic,labels = y))
optimizer = tf.train.AdamOptimizer().minimize(cost)

batch_size = 512
epoch = 5

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for e in range(epoch):
        e_cost = 0
        for b in range(0,int(mnist.train.num_examples/batch_size)):
            x1, y1 = mnist.train.next_batch(batch_size)
            c,_ = sess.run([cost, optimizer], feed_dict = {x: x1.T, y: y1.T})
            e_cost += c
        print("Epoch Cost: ", e_cost)

The output looks like this:

Epoch Cost:  485846.36608997884
Epoch Cost:  1133384.4635202957
Epoch Cost:  3738400.689635882
Epoch Cost:  9999002.612394715
Epoch Cost:  22214906.41488508

1 Answer


I've figured it out.

The function:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits = predic,labels = y))

requires the logits and labels matrices to have the shape (batch_size, num_outputs). My network produced them as (num_outputs, batch_size), so the softmax was being taken along the batch axis instead of the class axis, which is why the cost diverged instead of decreasing. I had to transpose the matrices to obtain the right result.
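
Here is a small toy sketch of what's going on (TF 1.x; the numbers are random and just illustrate the shapes): softmax_cross_entropy_with_logits_v2 normalizes over the last axis, so feeding (num_classes, batch_size) inputs softmaxes across the batch.

import numpy as np
import tensorflow as tf

# Toy logits/labels: 10 classes, batch of 3, stored column-major
# as in the network above, i.e. shape (num_classes, batch_size).
logits_cm = tf.constant(np.random.randn(10, 3))
labels_cm = tf.constant(np.eye(10)[:, :3])  # one-hot columns

# The op softmaxes over the LAST axis, so with (10, 3) inputs it
# normalizes across the batch, not across the classes:
wrong = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits_cm,
                                                   labels=labels_cm)
# Transposing to (batch_size, num_classes) gives one loss per example:
right = tf.nn.softmax_cross_entropy_with_logits_v2(logits=tf.transpose(logits_cm),
                                                   labels=tf.transpose(labels_cm))

with tf.Session() as sess:
    print(sess.run(wrong).shape)  # (10,) -- one "loss" per class: meaningless
    print(sess.run(right).shape)  # (3,)  -- one loss per example: correct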

The corrected function:

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits = tf.transpose(predic), labels = tf.transpose(y)))
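
As an aside, the transposes can be avoided entirely by building the network batch-major, i.e. with shapes (batch_size, features), which is the usual TensorFlow convention. This is just a sketch of that alternative, keeping the layer sizes and float64 dtype from the question; note the labels placeholder is float here so it matches the logits:

x = tf.placeholder(tf.float64, [None, 28 * 28])
y = tf.placeholder(tf.float64, [None, 10])  # float labels to match the logits

def NN(x):
    layer1, layer2, inps, outs = 10, 10, 28 * 28, 10
    # Weights are (in_features, out_features) so tf.matmul(x, w) is batch-major.
    w1 = tf.Variable(np.random.randn(inps, layer1))
    w2 = tf.Variable(np.random.randn(layer1, layer2))
    w3 = tf.Variable(np.random.randn(layer2, outs))
    l1 = tf.nn.relu(tf.matmul(x, w1))
    l2 = tf.nn.relu(tf.matmul(l1, w2))
    return tf.matmul(l2, w3)

predic = NN(x)
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(logits=predic, labels=y))
# ...then feed the batches untransposed: feed_dict={x: x1, y: y1}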