I am a beginner in image segmentation. I was trying to create a U-Net model with a pretrained ResNet34 (ImageNet) encoder, and for comparison I used the segmentation_models API to build the same model. However, my model does not perform as well as the imported one, even though their structure and backbone are nearly the same.
My model:
I used the following code to import the pretrained ResNet34:
# Assuming qubvel's classification_models package; use
# classification_models.keras instead if you are on standalone Keras.
from classification_models.tfkeras import Classifiers

ResNet34, preprocess_input = Classifiers.get('resnet34')
Resmodel = ResNet34((256, 256, 3), weights='imagenet')
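Since the skip connections below are picked by numeric layer index, it helps to verify those indices against the actual layer names and shapes first; a minimal sketch (the exact indices depend on the classification_models version):

# List every encoder layer with its index, name, and output shape so
# the skip indices used below can be double-checked for this build.
for i, layer in enumerate(Resmodel.layers):
    print(i, layer.name, layer.output_shape)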
Then I made a convolution block:
def ConvBlock(X, channel, kernel_size, bn=True):
    # Two same-padding convolutions, each optionally followed by
    # batch normalization, then ReLU.
    x = layers.Conv2D(filters=channel, kernel_size=(kernel_size, kernel_size),
                      strides=(1, 1), dilation_rate=(1, 1), padding='same',
                      kernel_initializer='he_normal')(X)
    if bn:
        x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters=channel, kernel_size=(kernel_size, kernel_size),
                      strides=(1, 1), dilation_rate=(1, 1), padding='same',
                      kernel_initializer='he_normal')(x)
    if bn:
        x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    return x
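To sanity-check the block in isolation, a quick shape test on a dummy tensor (the 16x16x512 shape here is just an example, and this assumes layers comes from tensorflow.keras as in the rest of the code):

import tensorflow as tf

# Quick shape check: the block should keep the spatial size and change
# only the channel count (16x16x512 -> 16x16x256 here).
t = tf.keras.Input(shape=(16, 16, 512))
print(ConvBlock(t, channel=256, kernel_size=3).shape)  # (None, 16, 16, 256)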
And finally constructed this model (note that UpSampling2D keeps the channel count, so the shape comments track that):
def new_model(output_channel, output_activation):
    inp = Resmodel.input
    # Skip connections taken from the pretrained encoder by layer index.
    skip1 = Resmodel.layers[5].output    # 128x128x64
    skip2 = Resmodel.layers[37].output   # 64x64x64
    skip3 = Resmodel.layers[74].output   # 32x32x128
    skip4 = Resmodel.layers[129].output  # 16x16x256
    encoder_final = Resmodel.layers[157].output  # 8x8x512

    # Decoder: upsample, concatenate the skip, then convolve.
    filters = 256
    x = layers.UpSampling2D()(encoder_final)  # 16x16x512
    x = layers.Concatenate()([x, skip4])      # 16x16x768
    x = ConvBlock(x, filters, kernel_size=3)  # 16x16x256
    filters //= 2
    x = layers.UpSampling2D()(x)              # 32x32x256
    x = layers.Concatenate()([x, skip3])      # 32x32x384
    x = ConvBlock(x, filters, kernel_size=3)  # 32x32x128
    filters //= 2
    x = layers.UpSampling2D()(x)              # 64x64x128
    x = layers.Concatenate()([x, skip2])      # 64x64x192
    x = ConvBlock(x, filters, kernel_size=3)  # 64x64x64
    filters //= 2
    x = layers.UpSampling2D()(x)              # 128x128x64
    x = layers.Concatenate()([x, skip1])      # 128x128x128
    x = ConvBlock(x, filters, kernel_size=3)  # 128x128x32
    filters //= 2
    x = layers.UpSampling2D()(x)              # 256x256x32
    x = ConvBlock(x, filters, kernel_size=3)  # 256x256x16
    x = layers.Conv2D(output_channel, kernel_size=(1, 1), strides=(1, 1),
                      padding='same')(x)      # 256x256xoutput_channel
    x = layers.Activation(output_activation)(x)  # e.g. 'sigmoid' for binary masks
    model = Model(inputs=inp, outputs=x)
    return model
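One difference from the imported model below is that it is built with encoder_freeze=True, while this decoder trains the pretrained encoder from the first epoch. A minimal sketch of freezing the encoder to match, assuming the hypothetical output_channel=1 / 'sigmoid' arguments and relying on the fact that new_model reuses Resmodel's layer objects:

model = new_model(output_channel=1, output_activation='sigmoid')

# new_model is built on Resmodel.input, so the encoder layers are the
# same objects; freezing them on Resmodel freezes them in the new model
# too. Do this before compiling.
for layer in Resmodel.layers:
    layer.trainable = False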
As a way to check whether I had done it right, I used the segmentation_models PyPI library to import a U-Net with a ResNet34 backbone.
Imported Model:
from segmentation_models import Unet
from segmentation_models.utils import set_trainable
model = Unet(backbone_name='resnet34', encoder_weights='imagenet', encoder_freeze=True)
model.summary()
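Beyond reading the summaries, a rough structural check is to compare parameter counts side by side (the output_channel=1 / 'sigmoid' arguments are just example values; a matching count does not guarantee identical structure):

# Compare total parameter counts of the two models; a large gap
# usually means the decoders differ in width or depth.
my_model = new_model(output_channel=1, output_activation='sigmoid')
print('my model:', my_model.count_params())
print('imported:', model.count_params())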
But the problem is, the imported model from the segmentation_models API seems to work much better (higher IoU score) than the model I created, even though the structure and backbone are nearly the same. So what am I doing wrong in my model? Thanks for reading such a long post.