
I'm getting an error when trying to restore a trained model for evaluation, but only when evaluating on the test set. The error is:

InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]

Note that lhs shape = [2325,11] and rhs shape = [4891,11] correspond to the 2325 images in the test set and the 4891 images in the training set, and the 11 is the one-hot encoding for 11 classes, so these shapes most likely belong to the labels. When I run evaluation on the training set, the dimensions match and no error occurs. Help would be appreciated!

Full stack trace below:

Traceback (most recent call last):
  File "eval.py", line 75, in <module>
    main()
  File "eval.py", line 70, in main
    acc_annotation, acc_retrieval = evaluate(partition="test")
  File "eval.py", line 34, in evaluate
    restorer.restore(sess, tf.train.latest_checkpoint(SAVED_MODEL_DIR))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1388, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
     [[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]

Caused by op u'save/Assign_5', defined at:
  File "eval.py", line 75, in <module>
    main()
  File "eval.py", line 70, in main
    acc_annotation, acc_retrieval = evaluate(partition="test")
  File "eval.py", line 25, in evaluate
    restorer = tf.train.Saver()  # For saving the model
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
    self.build()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1030, in build
    restore_sequentially=self._restore_sequentially)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 624, in build
    restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 373, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 130, in restore
    self.op.get_shape().is_fully_defined())
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
     [[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]

Update

I just looked into the tensor shapes from the checkpoint file and it looks like the saver was saving even the inputs to the model. I'll need to reconfigure my training code or otherwise figure out how to exclude the model inputs (labels and images) from the checkpointing:

('tensor_name: ', 'conv2-layer/bias/Adam_1')
(512,)
('tensor_name: ', 'input/Variable_1')
(4891, 11)
('tensor_name: ', 'conv2-layer/weights_1/Adam')
(5, 1, 64, 512)
('tensor_name: ', 'conv1-layer/weights_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv2-layer/weights_1')
(5, 1, 64, 512)
('tensor_name: ', 'conv2-layer/weights_1/Adam_1')
(5, 1, 64, 512)
('tensor_name: ', 'input/Variable')
(4891, 100, 23, 1)
('tensor_name: ', 'conv1-layer/weights_1/Adam_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv1-layer/bias/Adam')
(64,)
('tensor_name: ', 'beta2_power')
()
('tensor_name: ', 'conv2-layer/bias/Adam')
(512,)
('tensor_name: ', 'conv1-layer/bias/Adam_1')
(64,)
('tensor_name: ', 'conv2-layer/bias')
(512,)
('tensor_name: ', 'conv1-layer/bias')
(64,)
('tensor_name: ', 'beta1_power')
()
('tensor_name: ', 'conv1-layer/weights_1/Adam')
(5, 23, 1, 64)
('tensor_name: ', 'Variable')
()
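One possible way to keep the inputs out of the checkpoint without restructuring the graph is to pass an explicit `var_list` to `tf.train.Saver`. The sketch below is only an illustration under assumptions: it uses the `tensorflow.compat.v1` module (on an old 1.x install, a plain `import tensorflow as tf` should be the equivalent), the two variables are hypothetical stand-ins that reuse names from the listing above, and filtering on the `input/` prefix is my guess at what separates data from parameters in this graph.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Stand-in for the dataset-sized variable seen in the checkpoint listing.
    with tf.name_scope("input"):
        labels = tf.Variable(tf.zeros([4891, 11]), name="Variable_1")
    # Stand-in for a genuine model parameter.
    with tf.name_scope("conv1-layer"):
        bias = tf.Variable(tf.zeros([64]), name="bias")

    # Keep every variable except those created under the "input/" scope.
    model_vars = [v for v in tf.global_variables()
                  if not v.op.name.startswith("input/")]
    saver = tf.train.Saver(var_list=model_vars)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        ckpt_path = saver.save(
            sess, os.path.join(tempfile.mkdtemp(), "model.ckpt"))

# List what actually ended up in the checkpoint: name -> shape.
saved = dict(tf.train.list_variables(ckpt_path))
for name, shape in sorted(saved.items()):
    print(name, shape)
```

The evaluation script would need a `Saver` built with the same filter; otherwise it would still try to restore `input/Variable_1` and fail because the key is missing from the checkpoint.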
kashkar

1 Answer


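It would be useful to see the code that defines your model, but from what I can tell, you may have defined your input as a tf.Variable. A variable is a value that the optimiser is allowed to change in order to minimise the loss function. Variables are the learned weights of your model, which is why TensorFlow saves them so they can be restored later. By default, tf.train.Saver() saves every variable in the graph, so the checkpoint ends up holding a tensor with the shape of the training set, and restoring it into a graph built around the test set fails with the shape mismatch you are seeing.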

You should feed your input data to the graph using a tf.placeholder instead.
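As a rough illustration of the placeholder approach (not your actual model): the sketch below assumes the `tensorflow.compat.v1` module is available (on a 1.x install, plain `import tensorflow as tf` should behave the same), takes the image shape `[100, 23, 1]` and the 11 classes from the shapes in your question, and uses a hypothetical single-layer model with small made-up batches of 8 and 5 examples standing in for the training and test partitions.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

NUM_CLASSES = 11  # from the question

graph = tf.Graph()
with graph.as_default():
    # A leading dimension of None lets one graph accept any number of examples.
    images = tf.placeholder(tf.float32, shape=[None, 100, 23, 1], name="images")
    labels = tf.placeholder(tf.float32, shape=[None, NUM_CLASSES], name="labels")

    # Hypothetical tiny model: the only variable is a real parameter.
    flat = tf.reshape(images, [-1, 100 * 23])
    weights = tf.Variable(tf.zeros([100 * 23, NUM_CLASSES]), name="weights")
    logits = tf.matmul(flat, weights)

    # Shapes of everything a default Saver would write to the checkpoint.
    variable_shapes = [v.shape.as_list() for v in tf.global_variables()]

    output_shapes = {}
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Two differently sized partitions run through the same graph.
        for n in (8, 5):
            feed = {
                images: np.zeros([n, 100, 23, 1], dtype=np.float32),
                labels: np.zeros([n, NUM_CLASSES], dtype=np.float32),
            }
            output_shapes[n] = sess.run(logits, feed_dict=feed).shape

print(variable_shapes)
print(output_shapes)
```

Since placeholders are not variables, nothing whose shape depends on the dataset size is saved, so the same checkpoint can be restored for either partition.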

Florentin Hennecker
  • Yes, that was exactly it. I fixed it as you said by following the procedure from http://stackoverflow.com/questions/37091899/how-to-actually-read-csv-data-in-tensorflow – kashkar Feb 03 '17 at 04:06