
I used the code below for simple logistic regression. I was able to get the updated value of b: the values of b.eval() before and after training are different. However, the value of W.eval() remains the same. What mistake did I make? Thank you!

from __future__ import print_function

import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 20
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    print('W is:')
    print(W.eval())
    print('b is:')
    print(b.eval())
    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

    print('W is:')
    print(W.eval())
    print('b is:')
    print(b.eval())
    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y:     mnist.test.labels}))
  • See [this](https://stackoverflow.com/a/35962343/2861681) – vmg Aug 14 '17 at 13:40
  • I didn't initialize with all zeros; I used random normal initialization. In addition, the model has high predictive performance after training, so W cannot be a zero matrix. – vki Aug 14 '17 at 13:55

1 Answer


When you print a large NumPy array, only the first and last few values are shown; the middle is truncated. In the case of MNIST, the weights at those visible indices are not being updated, because the corresponding pixels stay constant from one image to the next: the digits are written in the centre of the image, not along the border, so the border pixels are zero in every sample. Only the weights connected to the varying centre pixels receive non-zero gradients and get updated.

To compare the weights before and after training you can use numpy.array_equal(w1, w2); or you can print the whole array by first calling numpy.set_printoptions(threshold=numpy.inf); or you can compare element by element and print only the values that differ by more than a certain threshold.
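For instance, here is a minimal sketch of all three checks. The names w_before and w_after are hypothetical snapshots you would take with W.eval() inside the session, before and after the training cycle; random data is used here only to keep the sketch self-contained:

import numpy as np

# Hypothetical snapshots; in the real script these would be
#   w_before = W.eval()   # inside the session, before the training cycle
#   w_after  = W.eval()   # inside the session, after training
w_before = np.random.randn(784, 10).astype(np.float32)
w_after = w_before.copy()
w_after[300:500, :] += 0.05          # pretend only centre-pixel weights changed

# 1) Exact comparison: False as soon as any element changed
print(np.array_equal(w_before, w_after))

# 2) Print the full array instead of the truncated '...' preview
np.set_printoptions(threshold=np.inf)
print(w_after)

# 3) Show only the entries that changed by more than a threshold
diff = np.abs(w_after - w_before)
print(np.argwhere(diff > 1e-6))      # (row, column) indices of updated weights
print(diff[diff > 1e-6])             # the corresponding magnitudes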
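The claim that constant pixels leave their weights untouched can also be checked directly. For this softmax model the gradient of the cross-entropy with respect to W is proportional to x^T (pred - y), so a pixel that is zero in every sample contributes an all-zero row to the gradient. A small NumPy sketch with made-up shapes:

import numpy as np

# Toy batch: 5 "images" with 8 pixels each, 3 classes.
x = np.random.rand(5, 8).astype(np.float32)
x[:, 0] = 0.0                        # pixel 0 is zero in every sample, like an MNIST border pixel
residual = np.random.randn(5, 3).astype(np.float32)  # stand-in for (pred - y)

grad_W = x.T @ residual / x.shape[0]  # gradient of the averaged cross-entropy w.r.t. W
print(grad_W[0])                      # row for pixel 0: all zeros, so W[0, :] never moves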

Gaurav Pawar