
I'm building a neural network to classify MNIST digits. I got the data from https://www.kaggle.com/c/digit-recognizer/data and tried to build the network with plain TensorFlow only (I didn't want to use Keras). Across epochs, the cost is not decreasing.

I followed this tutorial: https://adventuresinmachinelearning.com/python-tensorflow-tutorial/. The differences between my code and the tutorial's are that I didn't import the data through TensorFlow and I replaced deprecated functions. Note: I'm using Anaconda and tensorflow==1.14.

Here is my code:

import tensorflow as tf
import pandas as pd
import numpy as np

# Load the Kaggle digit-recognizer training CSV
tr_csv = "file:///D:/code/ml/neuralnet/train.csv"
data = pd.read_csv(tr_csv)

# Keep the first 30000 rows and split them into train/test sets
data = data[:30000]
train = data[:24000]
test = data[24000:]

# Hyperparameters
learning_rate = 1
epochs = 10
batch_size = 100

# Placeholders for the flattened 28x28 input images and the one-hot labels
xtf = tf.compat.v1.placeholder(tf.float32, [None, 784])
ytf = tf.compat.v1.placeholder(tf.float32, [None, 10])

# Weights and biases for the 784 -> 300 -> 10 network
W1 = tf.Variable(tf.random.normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random.normal([300]), name='b1')
W2 = tf.Variable(tf.random.normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random.normal([10]), name='b2')

# Hidden layer: affine transform followed by ReLU
hidden1 = tf.add(tf.matmul(xtf, W1), b1)
hidden1 = tf.nn.relu(hidden1)

# Output layer: softmax over the 10 digit classes, clipped to avoid log(0)
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden1, W2), b2))
yclipped = tf.clip_by_value(y_, 1e-10, 0.9999999)

# Cross-entropy cost, gradient-descent optimiser, and accuracy metric
cross_entropy = -tf.reduce_mean(tf.reduce_sum(
    ytf * tf.math.log(yclipped) + (1 - ytf) * tf.math.log(1 - yclipped), axis=1))
optimiser = tf.compat.v1.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(ytf, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.compat.v1.Session() as sess:
    init_op = tf.compat.v1.global_variables_initializer()
    sess.run(init_op)
    total_batch = int(len(train.label) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        # Split the training frame into mini-batches
        for chunk in np.array_split(train, total_batch):
            xb = chunk.drop(["label"], axis=1)
            # One-hot encode the labels
            yb = pd.get_dummies(pd.Categorical(chunk.label), prefix="number")

            (_, c) = sess.run([optimiser, cross_entropy],
                              feed_dict={xtf: xb.to_numpy(), ytf: yb.to_numpy()})

            avg_cost += c / batch_size
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))

Here is the cost over the epochs:



Epoch: 1 cost = 84.110
Epoch: 2 cost = 84.109
Epoch: 3 cost = 84.109
Epoch: 4 cost = 84.109
Epoch: 5 cost = 84.109
Epoch: 6 cost = 84.109
Epoch: 7 cost = 84.109
Epoch: 8 cost = 84.109
Epoch: 9 cost = 84.109
Epoch: 10 cost = 84.109

1 Answer


For any neural network, hyperparameter tuning is essential; try different combinations of values until you find a suitable set. The cost function tells you whether training is converging: if the cost does not decrease across epochs, as here, try a new set of hyperparameters (for example, a smaller learning rate, since 1 is quite aggressive for plain gradient descent).
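
As a concrete illustration, here is a minimal sketch of such a sweep, not your full pipeline: it rebuilds the question's 784-300-10 network for several learning rates and runs a few gradient steps on a random stand-in batch, just to show whether the cost moves at all. `build_graph` is a hypothetical helper introduced here; with the real Kaggle data you would feed your actual batches instead of the random arrays.

import numpy as np
import tensorflow as tf

def build_graph(learning_rate):
    # Hypothetical helper: rebuilds the question's 784-300-10 network
    # from scratch for a given learning rate.
    tf.compat.v1.reset_default_graph()
    x = tf.compat.v1.placeholder(tf.float32, [None, 784])
    y = tf.compat.v1.placeholder(tf.float32, [None, 10])
    W1 = tf.Variable(tf.random.normal([784, 300], stddev=0.03))
    b1 = tf.Variable(tf.random.normal([300]))
    W2 = tf.Variable(tf.random.normal([300, 10], stddev=0.03))
    b2 = tf.Variable(tf.random.normal([10]))
    hidden = tf.nn.relu(tf.matmul(x, W1) + b1)
    y_ = tf.clip_by_value(tf.nn.softmax(tf.matmul(hidden, W2) + b2),
                          1e-10, 0.9999999)
    cost = -tf.reduce_mean(tf.reduce_sum(
        y * tf.math.log(y_) + (1 - y) * tf.math.log(1 - y_), axis=1))
    train_op = tf.compat.v1.train.GradientDescentOptimizer(learning_rate).minimize(cost)
    return x, y, cost, train_op

# Random stand-in batch in [0, 1]; substitute real MNIST batches here
xb = np.random.rand(100, 784).astype(np.float32)
yb = np.eye(10)[np.random.randint(0, 10, 100)].astype(np.float32)

for lr in [1.0, 0.5, 0.1, 0.01]:
    x, y, cost, train_op = build_graph(lr)
    with tf.compat.v1.Session() as sess:
        sess.run(tf.compat.v1.global_variables_initializer())
        costs = [sess.run([train_op, cost], feed_dict={x: xb, y: yb})[1]
                 for _ in range(20)]
    print("lr = %.2f : first cost %.3f, last cost %.3f" % (lr, costs[0], costs[-1]))

If the cost only moves for the smaller rates, the learning rate of 1 was the problem. Also check that the input pixels are scaled to [0, 1]: the Kaggle CSV stores them as 0-255, and unscaled inputs combined with a large learning rate can saturate the clipped softmax and freeze the cost exactly as shown in your log.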