
I'm having some trouble understanding how MXNet ImageRecordIter works. Here is the reference I've been using.

For one, what does the --test-ratio flag actually do? When generating an lst file, I can't tell which lines are test data.

Another, larger issue I'm having is the format of labels. If we have N classes, a standard neural net output might be a softmax'd vector with N dimensions. A normal label in this case would be a one-hot encoding with a 1 in the dimension that maps to our class. But ImageRecordIter's label format seems to be just a single number. Is there some behind-the-scenes magic going on?

Priyantha
Joshua

1 Answer


Let's start with --train-ratio and --test-ratio. Both flags are used only to split the images into train and test groups. Here is the exact place in the code that processes these flags. Let me copy the logic from there:

    if args.train_ratio == 1.0:
        write_list(args.prefix + str_chunk + '.lst', chunk)
    else:
        if args.test_ratio:
            write_list(args.prefix + str_chunk + '_test.lst', chunk[:sep_test])
        if args.train_ratio + args.test_ratio < 1.0:
            write_list(args.prefix + str_chunk + '_val.lst', chunk[sep_test + sep:])
        write_list(args.prefix + str_chunk + '_train.lst', chunk[sep_test:sep_test + sep])

As can be seen, if --train-ratio is 1.0 the script completely ignores the test ratio and dumps all images into a single file (in our case caltech.lst). This is the main source of confusion, because here is how the default value for --train-ratio is populated:

    cgroup.add_argument('--train-ratio', type=float, default=1.0,
                        help='Ratio of images to use for training.')

By default it is set to 1.0. Therefore, whatever is passed to --test-ratio is irrelevant unless --train-ratio is also set to a value other than 1.0. Keeping this in mind, let's have a look at the command from the article:

    os.system('python %s/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories'%os.environ['MXNET_HOME'])

The command here includes only --test-ratio, so it will NOT produce two files (./data/caltech_train.lst and ./data/caltech_test.lst) as the article claims. Instead it produces a single file (./data/caltech.lst), for the reason explained above.

To fix this, here is the correct command to execute:

    os.system('python %s/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 --train-ratio=0.8 data/caltech data/101_ObjectCategories'%os.environ['MXNET_HOME'])

At this point I hope the source of the confusion is clear, along with how the two flags work.
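To see the effect of the two flags without running im2rec.py, here is a minimal sketch that mirrors the branch quoted above. The function name split_chunk and the returned dictionary are hypothetical, and the way sep and sep_test are computed (chunk length times each ratio, truncated to an integer) is an assumption, since that part is not in the quoted excerpt.

    ```python
    def split_chunk(chunk, train_ratio=1.0, test_ratio=0.0):
        """Return {file suffix: items}, mirroring the quoted im2rec.py branch."""
        # Assumption: the separators are the chunk length scaled by each ratio.
        sep = int(len(chunk) * train_ratio)
        sep_test = int(len(chunk) * test_ratio)
        if train_ratio == 1.0:
            # test_ratio is never consulted on this path.
            return {'.lst': chunk}
        out = {}
        if test_ratio:
            out['_test.lst'] = chunk[:sep_test]
        if train_ratio + test_ratio < 1.0:
            out['_val.lst'] = chunk[sep_test + sep:]
        out['_train.lst'] = chunk[sep_test:sep_test + sep]
        return out

    images = list(range(10))  # stand-in for ten image entries
    only_test = split_chunk(images, test_ratio=0.2)
    both = split_chunk(images, train_ratio=0.8, test_ratio=0.2)
    ```

With only test_ratio given, only_test holds a single '.lst' entry containing every item. With both ratios given, both holds separate '_test.lst' and '_train.lst' entries.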

Now, for the second part of the question. im2rec.py is a helper script for preparing the data. It is agnostic to how you actually plan to use the data, so it stores the label as a number (BTW, there can be more than one label per image). It is up to the consumer of the list to convert the label number into whatever form they want to use for training. You can use it with softmax by creating a vector whose size equals the number of classes and setting a 1 in the cell whose index equals the label number.
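As a rough illustration of that conversion, here is a sketch in plain Python. It assumes each .lst line is tab-separated as index, one or more labels, then the image path. The helper names parse_lst_line and one_hot, the sample line, and the class count of 5 are all made up for the example.

    ```python
    def parse_lst_line(line):
        """Split one .lst line into (index, labels, relative path)."""
        fields = line.rstrip('\n').split('\t')
        index = int(fields[0])
        # Everything between the index and the path is treated as a label.
        labels = [float(x) for x in fields[1:-1]]
        return index, labels, fields[-1]

    def one_hot(label, num_classes):
        """Build a vector of zeros with a 1.0 at the label's index."""
        vec = [0.0] * num_classes
        vec[int(label)] = 1.0
        return vec

    # Hypothetical line; real lines come from the generated .lst file.
    index, labels, path = parse_lst_line('12\t3.000000\tairplanes/image_0001.jpg\n')
    encoded = one_hot(labels[0], num_classes=5)
    ```

Depending on the loss you use, this conversion may not be needed at all, since many softmax and cross-entropy implementations accept the class index directly.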

PS: if the reader has time, I would encourage submitting a pull request with the fixed command to the repository that contains the article.