I am training a simple network. Because I was having trouble getting Caffe to run, I decided to do a test run on only 20 images. But I can't get past the following error message. I have rebuilt Caffe as suggested in other posts, but that didn't solve the issue.

I1008 13:52:01.227901 45606 solver.cpp:454] Snapshotting to binary proto file _iter_10.caffemodel
*** Aborted at 1475952725 (unix time) try "date -d @1475952725" if you are using GNU date ***
PC: @     0x7f5e0130768c   caffe::BlobProto::SerializeWithCachedSizesToArray()
*** SIGSEGV (@0xd70e000) received by PID 45606 (TID 0x7f5e01e0ea00) from PID 225501184; stack trace: ***
@     0x7f5df32c98d0 (unknown)
@     0x7f5e0130768c caffe::BlobProto::SerializeWithCachedSizesToArray()
@     0x7f5e0130d13f caffe::LayerParameter::SerializeWithCachedSizesToArray()
@     0x7f5e0130f8d7 caffe::NetParameter::SerializeWithCachedSizesToArray()
@     0x7f5dfb6fd58a (unknown)
@     0x7f5dfb6fd655 (unknown)
@     0x7f5dfb6fd7bf (unknown)
@     0x7f5dfb76815b (unknown)
@     0x7f5e01389803 caffe::WriteProtoToBinaryFile()
@     0x7f5e013a1a82 caffe::Solver<>::SnapshotToBinaryProto()
@     0x7f5e013a1b6f caffe::Solver<>::Snapshot()
@     0x7f5e013a3219 caffe::Solver<>::Step()
@     0x7f5e013a34a9 caffe::Solver<>::Solve()
@           0x409426 train()
@           0x405c83 main
@     0x7f5df2f30b45 (unknown)
@           0x406565 (unknown)
@                0x0 (unknown)
*** Error in `caffe': malloc(): memory corruption: 0x000000000d4ceac0 ***

I have a feeling that it is caused by my solver file. Here is my solver:

net: "/X/train.prototxt"
test_iter: 5
test_interval: 5
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "step"
stepsize: 5
gamma: 0.1
power: 0.75
display: 5
max_iter: 20
snapshot: 10
snapshot_prefix: "/X/A"
solver_mode: GPU

Do you see any issue with my solver?

Cheers,

Shai
user2413711

1 Answer

Is your model larger than 2 GB? If so, this error may be due to the size limit of the protobuf format. Try adding

snapshot_format: HDF5

at the end of your solver.prototxt to save snapshots in HDF5 format instead.

Related discussion can be found at: https://github.com/BVLC/caffe/pull/2836
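
To check whether the 2 GB question applies to your network, you can estimate the snapshot size from the parameter blob shapes. This is a rough sketch: it assumes weights are stored as 4-byte floats and ignores protobuf overhead, and the shapes in `example_shapes` are made up for illustration, not taken from your train.prototxt.

```python
# Rough estimate of how large a binary-proto snapshot would be.
# Assumption: every parameter is a 4-byte float and serialization overhead is negligible.

# Protobuf tracks serialized sizes in a signed 32-bit int, so a single
# message cannot grow past roughly 2 GiB.
PROTOBUF_LIMIT_BYTES = 2**31 - 1
BYTES_PER_FLOAT32 = 4


def blob_bytes(shape):
    """Bytes needed for one parameter blob with the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_FLOAT32


def estimate_snapshot_bytes(blob_shapes):
    """Sum the sizes of all parameter blobs in the network."""
    return sum(blob_bytes(shape) for shape in blob_shapes)


def exceeds_protobuf_limit(blob_shapes):
    """True if the estimated snapshot is larger than the protobuf limit."""
    return estimate_snapshot_bytes(blob_shapes) > PROTOBUF_LIMIT_BYTES


# Hypothetical shapes, for illustration only.
example_shapes = [
    (96, 3, 11, 11),   # convolution weights
    (96,),             # convolution bias
    (4096, 200000),    # a very large fully connected layer
    (4096,),           # fully connected bias
]

if __name__ == "__main__":
    total = estimate_snapshot_bytes(example_shapes)
    print("estimated snapshot size in bytes:", total)
    print("exceeds protobuf limit:", exceeds_protobuf_limit(example_shapes))
```

If the estimate comes out near or above the limit, the HDF5 snapshot format is the safer choice. If it is far below, the crash probably has a different cause.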

magicharp