
I am trying to perform multi-class semantic segmentation using TensorFlow with either TFLearn or Keras (I have tried both APIs). This is a similar problem to the one described in How to load Image Masks (Labels) for Image Segmentation in Keras.

I have to segment each image into 3 different classes: sea (class 0), boat (class 1), and sky (class 2).

I have 100 grayscale images (size 400x400), each with corresponding labels for the 3 classes. In the end, I have images with shape (100, 400, 400) and labels with shape (100, 400, 400, 3). (As explained here: How to implement multi-class semantic segmentation?)

To train for semantic segmentation I one-hot encoded the labels (as described here: https://www.jeremyjordan.me/semantic-segmentation/) and ended up with this:

train_images.shape: (100,400,400,1)
train_labels.shape: (100,400,400,3)

where the labels are encoded as follows: sea [1,0,0], boat [0,1,0], sky [0,0,1].
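
For reference, this is roughly how I build the two arrays (a minimal sketch; `gray_images` and `class_masks` are placeholder names for my raw data, with `class_masks` holding the integer class of each pixel):

    import numpy as np

    num_classes = 3  # sea, boat, sky
    # gray_images: (100, 400, 400) grayscale images
    # class_masks: (100, 400, 400) integer labels in {0, 1, 2}
    train_images = gray_images[..., np.newaxis].astype(np.float32)     # -> (100, 400, 400, 1)
    train_labels = np.eye(num_classes, dtype=np.float32)[class_masks]  # -> (100, 400, 400, 3), one-hot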

However, every time I try to train I get this error:

ValueError: Cannot feed value of shape (22, 240, 240, 3) for Tensor 'TargetsData/Y:0', which has shape '(?, 240, 240, 2)'

I load the model with this:

model = TheNet(input_shape=(None, 400, 400, 1))
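
The error itself is raised when I call fit on the TFLearn model. The call is essentially the following (the epoch count and batch size here are just placeholders), and the shape check compares each batch of labels against the `TargetsData/Y` placeholder that TFLearn builds from the network's final layer:

    # Hypothetical training call (not shown above): TFLearn feeds each batch of
    # train_labels into the Y placeholder and fails because the last dimensions
    # (3 classes in the labels vs 2 channels in the placeholder) do not match.
    model.fit(train_images, train_labels, n_epoch=10, batch_size=22, show_metric=True)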

EDIT: Here is the model that I use

  • With TFLearn:

    def TheNet(input_size = (80, 400, 400, 2), feature_map=8, kernel_size=5, keep_rate=0.8, lr=0.001, log_dir ="logs",savedir="Results/Session_Dump"):

        # level 0 input
        layer_0a_input  = tflearn.layers.core.input_data(input_size) #shape=[None,n1,n2,n3,1])

        # level 1 down
        layer_1a_conv   = tflearn_conv_2d(net=layer_0a_input, nb_filter=feature_map, kernel=5, stride=1, activation=False)
        layer_1a_stack  = tflearn_merge_2d([layer_0a_input]*feature_map, "concat")
        layer_1a_stack  = tflearn.activations.prelu(layer_1a_stack)
        layer_1a_add    = tflearn_merge_2d([layer_1a_conv,layer_1a_stack], "elemwise_sum")
        layer_1a_down   = tflearn_conv_2d(net=layer_1a_add, nb_filter=feature_map*2, kernel=2, stride=2, dropout=keep_rate)

        # level 2 down
        layer_2a_conv   = tflearn_conv_2d(net=layer_1a_down, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_2a_conv   = tflearn_conv_2d(net=layer_2a_conv, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_2a_add    = tflearn_merge_2d([layer_1a_down,layer_2a_conv], "elemwise_sum")
        layer_2a_down   = tflearn_conv_2d(net=layer_2a_add, nb_filter=feature_map*4, kernel=2, stride=2, dropout=keep_rate)

        # level 3 down
        layer_3a_conv   = tflearn_conv_2d(net=layer_2a_down, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3a_conv   = tflearn_conv_2d(net=layer_3a_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3a_conv   = tflearn_conv_2d(net=layer_3a_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3a_add    = tflearn_merge_2d([layer_2a_down,layer_3a_conv], "elemwise_sum")
        layer_3a_down   = tflearn_conv_2d(net=layer_3a_add, nb_filter=feature_map*8, kernel=2, stride=2, dropout=keep_rate)

        # level 4 down
        layer_4a_conv   = tflearn_conv_2d(net=layer_3a_down, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4a_conv   = tflearn_conv_2d(net=layer_4a_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4a_conv   = tflearn_conv_2d(net=layer_4a_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4a_add    = tflearn_merge_2d([layer_3a_down,layer_4a_conv], "elemwise_sum")
        layer_4a_down   = tflearn_conv_2d(net=layer_4a_add, nb_filter=feature_map*16,kernel=2,stride=2,dropout=keep_rate)

        # level 5
        layer_5a_conv   = tflearn_conv_2d(net=layer_4a_down, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_5a_conv   = tflearn_conv_2d(net=layer_5a_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_5a_conv   = tflearn_conv_2d(net=layer_5a_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_5a_add    = tflearn_merge_2d([layer_4a_down,layer_5a_conv], "elemwise_sum")
        layer_5a_up     = tflearn_deconv_2d(net=layer_5a_add, nb_filter=feature_map*8, kernel=2, stride=2, dropout=keep_rate)

        # level 4 up
        layer_4b_concat = tflearn_merge_2d([layer_4a_add,layer_5a_up], "concat")
        layer_4b_conv   = tflearn_conv_2d(net=layer_4b_concat, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4b_conv   = tflearn_conv_2d(net=layer_4b_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4b_conv   = tflearn_conv_2d(net=layer_4b_conv, nb_filter=feature_map*16, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_4b_add    = tflearn_merge_2d([layer_4b_conv,layer_4b_concat], "elemwise_sum")
        layer_4b_up     = tflearn_deconv_2d(net=layer_4b_add, nb_filter=feature_map*4, kernel=2, stride=2, dropout=keep_rate)

        # level 3 up
        layer_3b_concat = tflearn_merge_2d([layer_3a_add,layer_4b_up], "concat")
        layer_3b_conv   = tflearn_conv_2d(net=layer_3b_concat, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3b_conv   = tflearn_conv_2d(net=layer_3b_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3b_conv   = tflearn_conv_2d(net=layer_3b_conv, nb_filter=feature_map*8, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_3b_add    = tflearn_merge_2d([layer_3b_conv,layer_3b_concat], "elemwise_sum")
        layer_3b_up     = tflearn_deconv_2d(net=layer_3b_add, nb_filter=feature_map*2, kernel=2, stride=2, dropout=keep_rate)

        # level 2 up
        layer_2b_concat = tflearn_merge_2d([layer_2a_add,layer_3b_up], "concat")
        layer_2b_conv   = tflearn_conv_2d(net=layer_2b_concat, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_2b_conv   = tflearn_conv_2d(net=layer_2b_conv, nb_filter=feature_map*4, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_2b_add    = tflearn_merge_2d([layer_2b_conv,layer_2b_concat], "elemwise_sum")
        layer_2b_up     = tflearn_deconv_2d(net=layer_2b_add, nb_filter=feature_map, kernel=2, stride=2, dropout=keep_rate)

        # level 1 up
        layer_1b_concat = tflearn_merge_2d([layer_1a_add,layer_2b_up], "concat")
        layer_1b_conv   = tflearn_conv_2d(net=layer_1b_concat, nb_filter=feature_map*2, kernel=kernel_size, stride=1, dropout=keep_rate)
        layer_1b_add    = tflearn_merge_2d([layer_1b_conv,layer_1b_concat], "elemwise_sum")

        # level 0 classifier
        layer_0b_conv   = tflearn_conv_2d(net=layer_1b_add, nb_filter=2, kernel=5, stride=1, dropout=keep_rate)
        layer_0b_clf    = tflearn.layers.conv.conv_2d(layer_0b_conv, 2, 1, 1, activation="softmax")

        # Optimizer
        regress = tflearn.layers.estimator.regression(layer_0b_clf, optimizer='adam', loss=dice_loss_2d, learning_rate=lr) # categorical_crossentropy/dice_loss_3d

        model   = tflearn.models.dnn.DNN(regress, tensorboard_dir=log_dir)

        # Saving the model
        if not os.path.lexists(savedir+"weights"):
            os.makedirs(savedir+"weights")
        model.save(savedir+"weights/weights_session")

        return model
    
  • With Keras:

    def TheNet(input_shape, nb_kernel, kernel_size, dropout, lr, log_dir ="logs",savedir="Results/Session_Dump"):

        layer_0 = keras.Input(shape = input_shape)

        #LVL 1 Down
        layer_1_conv = Cust_2D_Conv(layer_0, nb_kernel, kernel_size, stride=1)
        layer_1_stak = keras.layers.concatenate([layer_0,layer_0,layer_0,layer_0,layer_0,layer_0,layer_0,layer_0])
        layer_1_stak = keras.layers.PReLU()(layer_1_stak)
        layer_1_addd = keras.layers.Multiply()([layer_1_conv,layer_1_stak])
        layer_1_down = Cust_2D_Conv(layer_1_addd, nb_kernel=nb_kernel*2, kernel_size=3, stride=2, dropout=0.2)

        #LVL 2 Down
        layer_2_conv = Cust_2D_Conv(layer_1_down, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
        layer_2_conv = Cust_2D_Conv(layer_2_conv, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
        layer_2_addd = keras.layers.Multiply()([layer_2_conv,layer_1_down])
        layer_2_down = Cust_2D_Conv(layer_2_addd, nb_kernel=nb_kernel*4, kernel_size=3, stride=2, dropout=0.2)

        #LVL 3 Down
        layer_3_conv = Cust_2D_Conv(layer_2_down, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
        layer_3_conv = Cust_2D_Conv(layer_3_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
        layer_3_conv = Cust_2D_Conv(layer_3_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
        layer_3_addd = keras.layers.Multiply()([layer_3_conv,layer_2_down])
        layer_3_down = Cust_2D_Conv(layer_3_addd, nb_kernel=nb_kernel*8, kernel_size=3, stride=2, dropout=0.2)

        #LVL 4 Down
        layer_4_conv = Cust_2D_Conv(layer_3_down, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_4_conv = Cust_2D_Conv(layer_4_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_4_conv = Cust_2D_Conv(layer_4_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_4_addd = keras.layers.Multiply()([layer_4_conv,layer_3_down])
        layer_4_down = Cust_2D_Conv(layer_4_addd, nb_kernel=nb_kernel*16, kernel_size=3, stride=2, dropout=0.2)

        #LVL 5 Down
        layer_5_conv = Cust_2D_Conv(layer_4_down, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_5_conv = Cust_2D_Conv(layer_5_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_5_conv = Cust_2D_Conv(layer_5_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_5_addd = keras.layers.Multiply()([layer_5_conv,layer_4_down])
        layer_5_up = Cust_2D_DeConv(layer_5_addd, nb_kernel=nb_kernel*8, kernel_size=3, stride=2, dropout=0.2)

        #LVL 4 Up
        layer_4b_concat = keras.layers.concatenate([layer_5_up, layer_4_addd])
        layer_4b_conv = Cust_2D_Conv(layer_4b_concat, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_4b_conv = Cust_2D_Conv(layer_4b_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_4b_conv = Cust_2D_Conv(layer_4b_conv, nb_kernel=nb_kernel*16, kernel_size=5, stride=1, dropout=0.2)
        layer_4b_addd = keras.layers.Multiply()([layer_4b_conv,layer_4b_concat])
        layer_4b_up = Cust_2D_DeConv(layer_4b_addd, nb_kernel=nb_kernel*4, kernel_size=3, stride=2, dropout=0.2)

        #LVL 3 Up
        layer_3b_concat = keras.layers.concatenate([layer_4b_up, layer_3_addd])
        layer_3b_conv = Cust_2D_Conv(layer_3b_concat, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_3b_conv = Cust_2D_Conv(layer_3b_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_3b_conv = Cust_2D_Conv(layer_3b_conv, nb_kernel=nb_kernel*8, kernel_size=5, stride=1, dropout=0.2)
        layer_3b_addd = keras.layers.Multiply()([layer_3b_conv,layer_3b_concat])
        layer_3b_up = Cust_2D_DeConv(layer_3b_addd, nb_kernel=nb_kernel*2, kernel_size=3, stride=2, dropout=0.2)

        #LVL 2 Up
        layer_2b_concat = keras.layers.concatenate([layer_3b_up, layer_2_addd])
        layer_2b_conv = Cust_2D_Conv(layer_2b_concat, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
        layer_2b_conv = Cust_2D_Conv(layer_2b_conv, nb_kernel=nb_kernel*4, kernel_size=5, stride=1, dropout=0.2)
        layer_2b_addd = keras.layers.Multiply()([layer_2b_conv,layer_2b_concat])
        layer_2b_up = Cust_2D_DeConv(layer_2b_addd, nb_kernel=nb_kernel, kernel_size=3, stride=2, dropout=0.2)

        #LVL 1 Up
        layer_1b_concat = keras.layers.concatenate([layer_2b_up, layer_1_addd])
        layer_1b_conv = Cust_2D_Conv(layer_1b_concat, nb_kernel=nb_kernel*2, kernel_size=5, stride=1, dropout=0.2)
        layer_1b_addd = keras.layers.Multiply()([layer_1b_conv,layer_1b_concat])

        #LVL 0
        layer_0b_conv = Cust_2D_Conv(layer_1b_addd, nb_kernel=2, kernel_size=5, stride=1, dropout=0.2)
        layer_0b_clf= keras.layers.Conv2D(2, 1, 1, activation="softmax")(layer_0b_conv)

        model = keras.Model(inputs=layer_0, outputs=layer_0b_clf, name='Keras_model')

        model.compile(loss=dice_loss_2d,
                      optimizer=keras.optimizers.Adam(),
                      metrics=['accuracy','categorical_accuracy'])

        return model
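
Both versions use `dice_loss_2d`, which is not shown above; it is meant to be a standard multi-class soft Dice loss. For the Keras version it looks roughly like this (a sketch of the idea, not necessarily my exact implementation):

    from keras import backend as K

    def dice_loss_2d(y_true, y_pred, smooth=1e-6):
        # Soft Dice averaged over classes; y_true and y_pred both have
        # shape (batch, height, width, n_classes).
        axes = (1, 2)
        intersection = K.sum(y_true * y_pred, axis=axes)
        denominator = K.sum(y_true, axis=axes) + K.sum(y_pred, axis=axes)
        dice = (2.0 * intersection + smooth) / (denominator + smooth)
        return 1.0 - K.mean(dice)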
    

I have been looking around for a solution, but nothing I have found is very clear.

Does anyone have an idea or suggestion?

Unic0
  • Why are your `train_labels.shape: (100,400,400,3)`, shouldn't it be `(100, 3)`? – Ahmad Baracat Dec 25 '19 at 23:09
  • Please provide your model. Merry Christmas to you too! – Geeocode Dec 26 '19 at 00:01
  • @AhmadBaracat, the labels are images as well; I want to perform pixel-wise segmentation, so I have 100 images with a width of 400, a height of 400, and 3 channels (one for each class I want to label). – Unic0 Dec 26 '19 at 00:52

1 Answer


For anyone who might face the same problem, I found the solution.

The issue is not in the input shapes per se. The shapes should indeed be (100, 400, 400, 1) for the input images and (100, 400, 400, 3) for the labels, respectively.

The problem lies in the model: its output shape has to match the shape of the labels. In the code shown in the original post, the output shape results directly from this line:

layer_0b_clf    = tflearn.layers.conv.conv_2d(layer_0b_conv, 2, 1, 1, activation="softmax")

which yields an output shape of (?, 400, 400, 2) and therefore doesn't match the label shape (100, 400, 400, 3). The solution is to change the number of output channels of the final layer to 3 (one per class):

- for TFLearn:

    layer_0b_clf    = tflearn.layers.conv.conv_2d(layer_0b_conv, 3, 1, 1, activation="softmax")

- for Keras:

    layer_0b_clf= keras.layers.Conv2D(3, 1, 1, activation="softmax")(layer_0b_conv)
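
To double-check the idea without the full network, here is a tiny self-contained Keras model (not the network from the question, just an illustration): the number of filters in the final 1x1 convolution is what has to match the number of classes.

    import keras

    # Toy check: the last Conv2D has 3 filters, so the per-pixel softmax output
    # matches labels of shape (N, 400, 400, 3).
    inputs = keras.Input(shape=(400, 400, 1))
    x = keras.layers.Conv2D(8, 3, padding="same", activation="relu")(inputs)
    outputs = keras.layers.Conv2D(3, 1, activation="softmax")(x)
    toy = keras.Model(inputs, outputs)
    print(toy.output_shape)  # (None, 400, 400, 3)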

Hopefully, it will help someone.

Thank you for your comments and for reading.

Unic0