
I am running a model on my own dataset (the project was implemented for training/testing with ImageNet) with 2 classes. I have made all the changes (in config files etc.), but after training finishes successfully, I get the following error when testing starts:

wrote gt roidb to ./data/cache/ImageNetVID_DET_val_gt_roidb.pkl
Traceback (most recent call last):
  File "experiments/dff_rfcn/dff_rfcn_end2end_train_test.py", line 20, in <module>
    test.main()
  File "experiments/dff_rfcn/../../dff_rfcn/test.py", line 53, in main
    args.vis, args.ignore_cache, args.shuffle, config.TEST.HAS_RPN, config.dataset.proposal, args.thresh, logger=logger, output_path=final_output_path)
  File "experiments/dff_rfcn/../../dff_rfcn/function/test_rcnn.py", line 68, in test_rcnn
    roidbs_seg_lens[gpu_id] += x['frame_seg_len']
KeyError: 'frame_seg_len'

I cleaned the cache folder before running. As I have read in previous topics, this might be an issue with .pkl files left in the cache by previous datasets. What may have caused this error? I also want to mention that I changed the .txt filenames that feed the neural network (in case this is important), and that training finishes well. This is my first time running a Deep Learning project, so please show some understanding.

PAPAS
  • Is this testing using a saved model? – Sina Afrooze Feb 01 '18 at 23:09
  • Looks like this question is specific to the [FGFA](https://github.com/msracver/Flow-Guided-Feature-Aggregation) implementation. Filing an [issue](https://github.com/msracver/Flow-Guided-Feature-Aggregation/issues) in that repo might be the best way to get an answer. – Indhu Bharathi Mar 12 '18 at 23:30

1 Answer


MXNet typically does not use pickle directly to serialize model architectures and trained weights; it provides its own save/load mechanisms.

With the Gluon API, you can save the weights of a model (i.e. a Block) to a file with .save_params() and then load them back with .load_params(). You 'save' the model architecture by keeping the code used to define the model. See an example of this here.

With the Module API, you can create a checkpoint at the end of each epoch, which saves both the symbol (i.e. model architecture) and the parameters (i.e. model weights). See here.

checkpoint = mx.callback.do_checkpoint(model_prefix)  # saves symbol + params each epoch
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)

You can then load the model from a given checkpoint (e.g. epoch 42 in this example); note that a Module must be bound to data shapes before its parameters can be set:

sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 42)
mod = mx.mod.Module(symbol=sym)
mod.bind(data_shapes=[('data', (1, 3, 224, 224))])  # shape is an example
mod.set_params(arg_params, aux_params)
Thom Lane