I started learning ML with TensorFlow/DeepLab. I tried to train my own model from scratch for clothes recognition using semantic segmentation with the mobilenet_v2 model variant, but I get no results.
I'm using tensorflow/models for the TFRecord export and for training, and the DeepLab example code for visualization and testing (renamed locally as main.py); I modified a few lines so it loads my local model and test image.
I'll show the process I followed:
- Download 100 JPEG images (I know that's not many, but I figured I could try with this amount), for a single class -> shirts.
- Create the segmentation class PNG for each image.
- Create the image set definition files for: train (85 filenames), trainval (100 filenames) and val (15 filenames); see the sketch right after this list.
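The split files and the label PNGs can be produced and checked with something like this (a minimal sketch, not my exact script; it assumes labelMe exported single-channel index PNGs where 0 = background, 1 = shirt and 255 = ignore, and it uses the pasc_imgs folder layout from the export command below):

import glob
import os

import numpy as np
from PIL import Image

# Sanity check: every segmentation PNG should only contain the label values
# 0 (background), 1 (shirt) and optionally 255 (ignore). Color/RGB PNGs or
# other values would not match num_classes=2.
for path in glob.glob('pasc_imgs/SegmentationClassPNG/*.png'):
    values = np.unique(np.array(Image.open(path)))
    assert set(values.tolist()) <= {0, 1, 255}, (path, values)

# Write the ImageSets definition files (one file name per line, no extension).
names = sorted(os.path.splitext(os.path.basename(p))[0]
               for p in glob.glob('pasc_imgs/JPEGImages/*.jpg'))
splits = {'train': names[:85], 'val': names[85:], 'trainval': names}
for split, split_names in splits.items():
    with open(os.path.join('pasc_imgs/ImageSets', split + '.txt'), 'w') as f:
        f.write('\n'.join(split_names) + '\n')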
So my "pascal dataset" directory has: ImageSets, JPEGImages and SegmentationClassPNG folders. Export the "pascal dataset" directory to tfrecord like this (I'm on "models-master/research/deeplab/datasets" folder):
py build_voc2012_data.py --image_folder="pasc_imgs/JPEGImages" --semantic_segmentation_folder="pasc_imgs/SegmentationClassPNG" --list_folder="pasc_imgs/ImageSets" --image_format="jpg" --output_dir="train/tfrecord"
- This works fine; it generates the *.tfrecord files in "train/tfrecord".
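To double-check the export, the records per shard can be counted with a small script like this (a sketch using the TF 1.x API, run from the datasets folder):

import glob

import tensorflow as tf

# Count how many examples ended up in each generated shard.
for path in sorted(glob.glob('train/tfrecord/*.tfrecord')):
    count = sum(1 for _ in tf.python_io.tf_record_iterator(path))
    print(path, count)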
I edited "models-master/research/deeplab/data_generator.py" like this: {'train': 85, 'trainval': 100, 'val': 15}, num_classes=2.
- Now it's time to start training (from "models-master/research/deeplab"). I used 10000 steps. Why? I already tried 30000 steps, which took about 30 hours with no results, so I reduced it and changed the parameters; I figured 10000 steps should show me something:
py train.py --logtostderr --training_number_of_steps=10000 --train_split="train" --model_variant="mobilenet_v2" --output_stride=16 --decoder_output_stride=4 --train_batch_size=1 --dataset="pascal_voc_seg" --train_logdir="datasets/train/deeplab_model_mn" --dataset_dir="datasets/train/tfrecord"
- This step takes almost 8 hours (I only have a tiny GPU, so I can't use it), and it generates the checkpoint, graph.pbtxt and model.ckpt-XXX files (10000 included).
- I exported the previous result (from "models-master/research/deeplab") with this command line:
py export_model.py --checkpoint_path=datasets/train/deeplab_model_mn/model.ckpt-10000 --export_path=datasets/train/deeplab_inference_mn/frozen_inference_graph.pb --model_variant="mobilenet_v2" --output_stride=16 --num_classes=2
- It creates the frozen graph (frozen_inference_graph.pb).
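A quick way to confirm the export worked is to parse the frozen graph and look for the DeepLab input/output nodes (a sketch; as far as I can tell, export_model.py names them ImageTensor and SemanticPredictions):

import tensorflow as tf

# Parse the exported frozen graph and verify the expected nodes are present.
graph_def = tf.GraphDef()
with tf.gfile.GFile('datasets/train/deeplab_inference_mn/frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

node_names = {node.name for node in graph_def.node}
print('total nodes:', len(node_names))
print('ImageTensor present:', 'ImageTensor' in node_names)
print('SemanticPredictions present:', 'SemanticPredictions' in node_names)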
- Now run: py main.py (the proof image and frozen_inference_graph.pb are already imported).
- No results with my custom model. This last script works with the pre-trained mobilenetv2_coco_voc_trainaug model, but not with my custom model.
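For reference, the part of main.py that I modified boils down to roughly this (a sketch based on the DeepLab demo; INPUT_SIZE and the tensor names come from that demo, and shirt_test.jpg stands in for my proof image):

import numpy as np
import tensorflow as tf
from PIL import Image

INPUT_SIZE = 513  # resize target used by the DeepLab demo

# Load my exported frozen graph instead of the pre-trained tarball.
graph_def = tf.GraphDef()
with tf.gfile.GFile('datasets/train/deeplab_inference_mn/frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

# Resize the test image the same way the demo does and run it through the graph.
image = Image.open('shirt_test.jpg').convert('RGB')
scale = float(INPUT_SIZE) / max(image.size)
resized = image.resize((int(scale * image.size[0]), int(scale * image.size[1])), Image.ANTIALIAS)

with tf.Session(graph=graph) as sess:
    seg_map = sess.run('SemanticPredictions:0',
                       feed_dict={'ImageTensor:0': [np.asarray(resized)]})[0]

# Expected labels: 0 (background) and 1 (shirt).
print(np.unique(seg_map))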
data_generator.py (edited lines):
_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 85,
        'trainval': 100,
        'val': 15,
    },
    num_classes=2,  # 0: background, 1: shirt
    ignore_label=255,
)
Example image (1/100) that I'm using for training, with its label created with the labelMe utility:
shirt_001.jpg
shirt_001.png
main.py results for mobilenetv2_coco_voc_trainaug (it labels the shirt as a person, which is expected) and for my custom model:
mobilenetv2_coco_voc_trainaug result
my custom model result
As you can see, my model fails. I've been testing many combinations without success. What should I do? Thank you!