I am a beginner in image segmentation. I was trying to create a U-Net model with a pretrained ResNet34 (ImageNet) encoder, and for comparison I used the segmentation_models API to build the same model. However, my model does not perform as well as the imported one, even though their structure and backbone are nearly the same.
My model:
I used the following code to import the pretrained ResNet34:
# Assuming qubvel's classification_models package; use
# classification_models.keras instead if you are on standalone Keras.
from classification_models.tfkeras import Classifiers

ResNet34, preprocess_input = Classifiers.get('resnet34')
Resmodel = ResNet34((256, 256, 3), weights='imagenet')
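Since the skip connections below are picked by numeric layer index, it helps to verify those indices against the actual layer names and shapes first; a minimal sketch (the exact indices depend on the classification_models version):

# List every encoder layer with its index, name, and output shape so
# the skip indices used below can be double-checked for this build.
for i, layer in enumerate(Resmodel.layers):
    print(i, layer.name, layer.output_shape)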
Then I made a convolution block:
def ConvBlock(X, channel, kernel_size, bn=True):
    # Two same-padding convolutions, each optionally followed by
    # batch normalization, then ReLU.
    x = layers.Conv2D(filters=channel, kernel_size=(kernel_size, kernel_size),
                      strides=(1, 1), dilation_rate=(1, 1), padding='same',
                      kernel_initializer='he_normal')(X)
    if bn:
        x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters=channel, kernel_size=(kernel_size, kernel_size),
                      strides=(1, 1), dilation_rate=(1, 1), padding='same',
                      kernel_initializer='he_normal')(x)
    if bn:
        x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    return x
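To sanity-check the block in isolation, a quick shape test on a dummy tensor (the 16x16x512 shape here is just an example, and this assumes layers comes from tensorflow.keras as in the rest of the code):

import tensorflow as tf

# Quick shape check: the block should keep the spatial size and change
# only the channel count (16x16x512 -> 16x16x256 here).
t = tf.keras.Input(shape=(16, 16, 512))
print(ConvBlock(t, channel=256, kernel_size=3).shape)  # (None, 16, 16, 256)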
And finally constructed this model (note that UpSampling2D keeps the channel count, so the shape comments track that):
def new_model(output_channel, output_activation):
    inp = Resmodel.input
    # Skip connections taken from the pretrained encoder by layer index.
    skip1 = Resmodel.layers[5].output    # 128x128x64
    skip2 = Resmodel.layers[37].output   # 64x64x64
    skip3 = Resmodel.layers[74].output   # 32x32x128
    skip4 = Resmodel.layers[129].output  # 16x16x256
    encoder_final = Resmodel.layers[157].output  # 8x8x512

    # Decoder: upsample, concatenate the skip, then convolve.
    filters = 256
    x = layers.UpSampling2D()(encoder_final)  # 16x16x512
    x = layers.Concatenate()([x, skip4])      # 16x16x768
    x = ConvBlock(x, filters, kernel_size=3)  # 16x16x256
    filters //= 2
    x = layers.UpSampling2D()(x)              # 32x32x256
    x = layers.Concatenate()([x, skip3])      # 32x32x384
    x = ConvBlock(x, filters, kernel_size=3)  # 32x32x128
    filters //= 2
    x = layers.UpSampling2D()(x)              # 64x64x128
    x = layers.Concatenate()([x, skip2])      # 64x64x192
    x = ConvBlock(x, filters, kernel_size=3)  # 64x64x64
    filters //= 2
    x = layers.UpSampling2D()(x)              # 128x128x64
    x = layers.Concatenate()([x, skip1])      # 128x128x128
    x = ConvBlock(x, filters, kernel_size=3)  # 128x128x32
    filters //= 2
    x = layers.UpSampling2D()(x)              # 256x256x32
    x = ConvBlock(x, filters, kernel_size=3)  # 256x256x16
    x = layers.Conv2D(output_channel, kernel_size=(1, 1), strides=(1, 1),
                      padding='same')(x)      # 256x256xoutput_channel
    x = layers.Activation(output_activation)(x)  # e.g. 'sigmoid' for binary masks
    model = Model(inputs=inp, outputs=x)
    return model
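One difference from the imported model below is that it is built with encoder_freeze=True, while this decoder trains the pretrained encoder from the first epoch. A minimal sketch of freezing the encoder to match, assuming the hypothetical output_channel=1 / 'sigmoid' arguments and relying on the fact that new_model reuses Resmodel's layer objects:

model = new_model(output_channel=1, output_activation='sigmoid')

# new_model is built on Resmodel.input, so the encoder layers are the
# same objects; freezing them on Resmodel freezes them in the new model
# too. Do this before compiling.
for layer in Resmodel.layers:
    layer.trainable = False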
As a way to check whether I had done it right, I used the segmentation_models PyPI library to import a U-Net with a ResNet34 backbone.
Imported Model:
from segmentation_models import Unet
from segmentation_models.utils import set_trainable
model = Unet(backbone_name='resnet34', encoder_weights='imagenet', encoder_freeze=True)
model.summary()
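Beyond reading the summaries, a rough structural check is to compare parameter counts side by side (the output_channel=1 / 'sigmoid' arguments are just example values; a matching count does not guarantee identical structure):

# Compare total parameter counts of the two models; a large gap
# usually means the decoders differ in width or depth.
my_model = new_model(output_channel=1, output_activation='sigmoid')
print('my model:', my_model.count_params())
print('imported:', model.count_params())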
But the problem is, the imported model from the segmentation_models API seems to work much better (higher IoU score) than the model I created, even though the structure and backbone are nearly the same. So what am I doing wrong in my model? Thanks for reading such a long post.