I'm getting an error when trying to restore a trained model for evaluation, but only when evaluating on the test set. The error is:
InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
Note that lhs shape = [2325,11] and rhs shape = [4891,11] corresponds to 2325 images in the test set and 4891 images in the training set; and the 11 is one-hot encoding for 11 classes - so these likely correspond to the labels. When I run evaluation on the training set the dimensions match and no error results. Help would be appreciated!
Full stack trace below:
Traceback (most recent call last):
File "eval.py", line 75, in <module>
main()
File "eval.py", line 70, in main
acc_annotation, acc_retrieval = evaluate(partition="test")
File "eval.py", line 34, in evaluate
restorer.restore(sess, tf.train.latest_checkpoint(SAVED_MODEL_DIR))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1388, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 964, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
[[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]
Caused by op u'save/Assign_5', defined at:
File "eval.py", line 75, in <module>
main()
File "eval.py", line 70, in main
acc_annotation, acc_retrieval = evaluate(partition="test")
File "eval.py", line 25, in evaluate
restorer = tf.train.Saver() # For saving the model
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1000, in __init__
self.build()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1030, in build
restore_sequentially=self._restore_sequentially)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 624, in build
restore_sequentially, reshape)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 373, in _AddRestoreOps
assign_ops.append(saveable.restore(tensors, shapes))
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 130, in restore
self.op.get_shape().is_fully_defined())
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 47, in assign
use_locking=use_locking, name=name)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2325,11] rhs shape= [4891,11]
[[Node: save/Assign_5 = Assign[T=DT_FLOAT, _class=["loc:@input/Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/cpu:0"](input/Variable_1, save/RestoreV2_5)]]
Update
I just looked into the tensor shapes from the checkpoint file and it looks like the saver was saving even the inputs to the model. I'll need to reconfigure my training code or otherwise figure out how to exclude the model inputs (labels and images) from the checkpointing:
('tensor_name: ', 'conv2-layer/bias/Adam_1')
(512,)
('tensor_name: ', 'input/Variable_1')
(4891, 11)
('tensor_name: ', 'conv2-layer/weights_1/Adam')
(5, 1, 64, 512)
('tensor_name: ', 'conv1-layer/weights_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv2-layer/weights_1')
(5, 1, 64, 512)
('tensor_name: ', 'conv2-layer/weights_1/Adam_1')
(5, 1, 64, 512)
('tensor_name: ', 'input/Variable')
(4891, 100, 23, 1)
('tensor_name: ', 'conv1-layer/weights_1/Adam_1')
(5, 23, 1, 64)
('tensor_name: ', 'conv1-layer/bias/Adam')
(64,)
('tensor_name: ', 'beta2_power')
()
('tensor_name: ', 'conv2-layer/bias/Adam')
(512,)
('tensor_name: ', 'conv1-layer/bias/Adam_1')
(64,)
('tensor_name: ', 'conv2-layer/bias')
(512,)
('tensor_name: ', 'conv1-layer/bias')
(64,)
('tensor_name: ', 'beta1_power')
()
('tensor_name: ', 'conv1-layer/weights_1/Adam')
(5, 23, 1, 64)
('tensor_name: ', 'Variable')
()