The code that obtains the learning rate in resnet50_trainer.py is as follows:

learning_rate = workspace.FetchBlob(prefix + '/conv1_w_lr')

When I run the code, the following error occurs:

Traceback (most recent call last):
  File "/home/caffe2/caffe2/caffe2/python/examples/resnet50_trainer.py", line 475, in
    main()
  File "/home/caffe2/caffe2/caffe2/python/examples/resnet50_trainer.py", line 471, in main
    Train(args)
  File "/home/caffe2/caffe2/caffe2/python/examples/resnet50_trainer.py", line 400, in Train
    explog
  File "/home/caffe2/caffe2/caffe2/python/examples/resnet50_trainer.py", line 163, in RunEpoch
    learning_rate = workspace.FetchBlob(prefix + '/conv1_w_lr')
  File "/home/caffe2-master/caffe2/build/caffe2/python/workspace.py", line 323, in FetchBlob
    return C.fetch_blob(StringifyBlobName(name))
RuntimeError: [enforce fail at pybind_state.cc:152] ws->HasBlob(name). Can't find blob: gpu_0/conv1_w_lr

What causes this problem? Do I need to recompile any dependencies, or is there another function I can use to obtain the learning rate?


1 Answer

Before you call FetchBlob(prefix + '/conv1_w_lr'), you can check which blobs exist in the workspace:

for b in workspace.Blobs():
    print(b)

Perhaps conv1_w_lr exists without the prefix. Alternatively, you may first need to call RunNet so that all the blobs are created in the workspace.
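To narrow down a long blob listing, a small filter can help. The sketch below is plain Python and does not import caffe2. The helper name find_blobs and the sample blob names are hypothetical and only illustrate the idea. In a real session you would pass workspace.Blobs() instead of the sample list.

```python
def find_blobs(blob_names, fragment="lr"):
    """Return the blob names that contain `fragment`, in sorted order."""
    # str() guards against names that are not plain strings.
    return sorted(str(name) for name in blob_names if fragment in str(name))


# Hypothetical names, for illustration only.
# In a real session: find_blobs(workspace.Blobs())
example_names = ["gpu_0/conv1_w", "gpu_0/data", "example_lr_blob"]

for name in find_blobs(example_names):
    print(name)
```

If the filtered list is empty, the learning rate blob probably does not exist yet, or it is stored under a name that does not contain the fragment you searched for.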

  • Thanks a lot. I reviewed resnet50_trainer.py on GitHub and found that it differs from the version I downloaded. The way of getting the learning rate has changed: learning_rate = workspace.FetchBlob(data_parallel_model.GetLearningRateBlobNames(train_model)[0]). I recompiled the latest caffe2 and now it works. – sq g Jul 27 '17 at 06:32