I use Segmentation Models library for multi-class (in my case 4 class) semantic segmentation. The model (UNet with 'resnet34' backbone) is trained with 3000 RGB (224x224x3) images. The accuracy is around 92.80%.
1) Why model.predict() function requires (1,224,224,3) shaped array as input ? I didn't find the answer even in the Keras documentation. Actually, below code is working, I have no problem with it but I want to understand the reason.
predictions = model.predict( test_image.reshape(-1,224,224,3) );
2) predictions
is a (1,224,224,3) shaped numpy array. Its data type is float32
and contains some floating numbers. What is the meaning of the numbers inside this array? How can I visualize them? I mean, I assumed that the result array will contain one of 4 class label (from 0 to 3) for every pixel, and then I will apply the color map for each class. In other words, the result should have been a prediction map, but I didn't get it. To understand better what I mean about prediction map, please visit the Jeremy Jordan's blog about semantic segmentation.
result = predictions[0]
plt.imshow(result) # import matplotlib.pyplot as plt
3) What I finally want to do is like Github: mrgloom - Semantic Segmentation Categorical Crossentropy Example did in visualy_inspect_result
function.