I have been playing around with TensorFlow, looking at the tutorial from:
https://github.com/aymericdamien/TensorFlow-Examples/tree/0.11/examples/3_NeuralNetworks
Because I did not want to use the MNIST database, I changed the script to use some data I created myself: 8,000 training samples, with evaluation done on 300 test samples. The output is a binary classification. Bear in mind that I have only just dived into machine learning, so my knowledge is quite limited for now.
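For context, each row of my CSV files has 7 numeric feature columns followed by a 0/1 label in the eighth column, with no header. If it helps to reproduce the problem, something like this writes files with the same shape (made-up random values, not my real data, so it only reproduces the shapes, not the behaviour):

import numpy as np
import pandas as pd

# Hypothetical stand-in for my real data: 7 feature columns plus a 0/1 label,
# matching what the script below expects.
def make_dummy_csv(path, n_rows):
    features = np.random.randn(n_rows, 7)
    labels = np.random.randint(0, 2, size=(n_rows, 1))
    pd.DataFrame(np.hstack([features, labels])).to_csv(path, header=False, index=False)

make_dummy_csv('./myInputDataNS.csv', 8000)  # training set
make_dummy_csv('./myTestDataNS.csv', 300)    # test set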
The script runs fine, but the cost gets stuck at a very high value and does not converge to 0. First, is that normal? How can I improve it? Did I do something wrong? Second, the accuracy is not very good either; is that due to the poor convergence? Maybe 8,000 samples are not enough to train the model, or maybe the values are too scattered to get a better accuracy (see the standardization sketch after the code).
I found a similar problem here:
tensorflow deep neural network for regression always predict same results in one batch
but I do not understand whether or how that problem applies to mine.
Could someone help me please?
Here is the output:
Starting 1st session...
Epoch: 0001 cost= 39926820.730
and at the end I get:
Epoch: 0671 cost= 64.798
Epoch: 0681 cost= 64.794
Epoch: 0691 cost= 64.791
Optimization Finished!
Accuracy: 0.716621
The code is as follows:
import tensorflow as tf
import pandas as pd
import numpy as np

# Load the data: 7 numeric feature columns followed by a 0/1 label column.
inputData = pd.read_csv('./myInputDataNS.csv', header=None)
runData = pd.read_csv('./myTestDataNS.csv', header=None)
trX, trY = inputData.iloc[:, :7].values, inputData.iloc[:, 7].values
# Turn the 0/1 label column into two-column one-hot vectors: [1-y, y].
trY = trY.reshape(len(trY), 1)
trY = np.concatenate((1 - trY, trY), axis=1)

teX, teY = runData.iloc[:, :7].values, runData.iloc[:, 7].values
teY = teY.reshape(len(teY), 1)
teY = np.concatenate((1 - teY, teY), axis=1)
# Parameters
learning_rate = 0.001
training_epochs = 700
batch_size = 100
display_step = 10
# Network Parameters
n_hidden_1 = 320
n_hidden_2 = 320
n_hidden_3 = 320
n_input = 7
n_classes = 2 # (0 or 1)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# Build a 3-hidden-layer fully connected network (ReLU activations, linear output).
def multilayer_perceptron(x, weights, biases):
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    layer_3 = tf.add(tf.matmul(layer_2, weights['h3']), biases['b3'])
    layer_3 = tf.nn.relu(layer_3)
    # Output layer: raw logits, softmax is applied inside the loss.
    out_layer = tf.matmul(layer_3, weights['out']) + biases['out']
    return out_layer
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3])),
    'out': tf.Variable(tf.random_normal([n_hidden_3, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'b3': tf.Variable(tf.random_normal([n_hidden_3])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
# Construct the model, the loss (softmax cross-entropy on the logits) and the optimizer.
pred = multilayer_perceptron(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
init = tf.global_variables_initializer()
print("Starting 1st session...")
with tf.Session() as sess:
sess.run(init)
for epoch in range(training_epochs):
epoch_loss = 0
i = 0
while i < len(trX):
start = i
end = i + batch_size
batch_x = np.array(trX[start:end])
batch_y = np.array(trY[start:end])
_, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
epoch_loss += c
i += batch_size
epoch_loss += c / len(trX[0])
if epoch % display_step == 0:
print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.3f}".format(epoch_loss))
print("Optimization Finished!")
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print("Accuracy:", accuracy.eval({x: teX, y: teY}))