How can I fine-tune the EfficientNetB3 model and retain some of its existing labels?

Question

I've tested an EfficientNetB3 model (trained on ImageNet) on my large image set. It recognizes some of my image classes with varying accuracy; the others are not recognized at all.

For example, it does a great job for school buses: ('n04146614', 'school_bus') and a decent job for ('n04487081', 'trolleybus'), ('n02701002', 'ambulance'), ('n03977966', 'police_van').

So I would like to keep these labels and feed more images to the model to improve their detection rate. At the same time, while it detects police vans, it completely misses other police vehicles, so I would have to create new labels for them.

How should I approach this? Is it possible in one training session?

Would you share some details about your dataset, such as image characteristics, the number of images in each class, etc.? If possible, a confusion matrix would help others give their opinion. — Cloud Cho, Nov 29 '22 at 02:08

score 0 · Answer 1 · edited Dec 04 '22 at 13:02

A model trained on ImageNet will do a reasonably good job of identifying images whose classes were included in the original ImageNet dataset. If a class was not present, the model will perform very poorly on it. What you normally do is customize the model for the unique classes in your dataset. This process is called transfer learning.

First, decide which classes you want and gather the appropriate images for each class. For example, let's say you have the classes police car, school bus, fire truck, garbage truck and delivery van, so there are 5 classes. Typically you need about 120 to 150 images per class as a minimum.

Create a single directory and call it sdir. Below sdir, create 5 subdirectories, one for each class, named police car, school bus, etc., and put the images into their respective subdirectories. The function below can then be used to split the dataset into three DataFrames called train_df, test_df and valid_df.

import os
import pandas as pd
from sklearn.model_selection import train_test_split

def preprocess(sdir, trsplit, vsplit):
    # walk sdir: each subdirectory is a class, each file inside it is one sample
    filepaths=[]
    labels=[]
    classlist=os.listdir(sdir)
    for klass in classlist:
        classpath=os.path.join(sdir,klass)
        if os.path.isdir(classpath):
            flist=os.listdir(classpath)
            for f in flist:
                fpath=os.path.join(classpath,f)
                filepaths.append(fpath)
                labels.append(klass)
    Fseries=pd.Series(filepaths, name='filepaths')
    Lseries=pd.Series(labels, name='labels')
    df=pd.concat([Fseries, Lseries], axis=1)
    # trsplit and vsplit are fractions of the whole dataset; dsplit is the share
    # of the held-out remainder that goes to validation, the rest becomes the test set
    dsplit=vsplit/(1-trsplit)
    # stratify both splits so every class keeps its proportion in each subset
    strat=df['labels']
    train_df, dummy_df=train_test_split(df, train_size=trsplit, shuffle=True, random_state=123, stratify=strat)
    strat=dummy_df['labels']
    valid_df, test_df= train_test_split(dummy_df, train_size=dsplit, shuffle=True, random_state=123, stratify=strat)
    print('train_df length: ', len(train_df), '  test_df length: ',len(test_df), '  valid_df length: ', len(valid_df))
    print(list(train_df['labels'].value_counts())) # images per class in the training set
    return train_df, test_df, valid_df

Now call the function

sdir=r'C:\sdir'
trsplit=.8 # fraction of images to use for training
vsplit=.1 # fraction of images to use for validation; the remainder (here .1) becomes the test set
train_df, test_df, valid_df= preprocess(sdir,trsplit, vsplit)
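
To see how the two fractions combine, here is a small arithmetic sketch of the split performed by preprocess. The helper split_fractions is hypothetical and only mirrors the dsplit calculation above; the exact row counts you get also depend on how train_test_split rounds.

```python
def split_fractions(trsplit, vsplit):
    """Return the (train, valid, test) fractions of the whole dataset."""
    remainder = 1 - trsplit          # held out after the first split
    dsplit = vsplit / remainder      # share of the remainder used for validation
    valid = remainder * dsplit       # works out to vsplit
    test = remainder * (1 - dsplit)  # whatever is left over
    return trsplit, valid, test

train, valid, test = split_fractions(0.8, 0.1)
print(round(train, 3), round(valid, 3), round(test, 3))
```

With trsplit=.8 and vsplit=.1, dsplit comes out at about 0.5, so the held-out 20% is divided evenly between validation and test. Note that trsplit + vsplit has to stay below 1, otherwise nothing is left for the test set.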

Now you need to create 3 generators using ImageDataGenerator.flow_from_dataframe, one each for the training, validation and test DataFrames. See the Keras documentation for its parameters.

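import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# assumed value: the original answer never defines img_size; 300 x 300 is the default input resolution of EfficientNetB3
img_size=(300, 300)
channels=3
batch_size=20 # set batch size based on model complexity and size of images
img_shape=(img_size[0], img_size[1], channels)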
# calculate test_batch_size and test_steps so that test_batch_size X test_steps = number of test images
# this ensures you go through the test set exactly once when doing predictions on the test set
length=len(test_df)
test_batch_size=sorted([int(length/n) for n in range(1,length+1) if length % n ==0 and length/n<=80],reverse=True)[0]  
test_steps=int(length/test_batch_size)
print ( 'test batch size: ' ,test_batch_size, '  test steps: ', test_steps)
trgen=ImageDataGenerator(horizontal_flip=True)
tvgen=ImageDataGenerator()
msg='                                                              for the train generator'
print(msg, '\r', end='') 
train_gen=trgen.flow_from_dataframe( train_df, x_col='filepaths', y_col='labels', target_size=img_size, class_mode='categorical',
                                    color_mode='rgb', shuffle=True, batch_size=batch_size)
msg='                                                              for the test generator'
print(msg, '\r', end='') 
test_gen=tvgen.flow_from_dataframe( test_df, x_col='filepaths', y_col='labels', target_size=img_size, class_mode='categorical',
                                    color_mode='rgb', shuffle=False, batch_size=test_batch_size)
msg='                                                             for the validation generator'
print(msg, '\r', end='')
valid_gen=tvgen.flow_from_dataframe( valid_df, x_col='filepaths', y_col='labels', target_size=img_size, class_mode='categorical',
                                    color_mode='rgb', shuffle=True, batch_size=batch_size)
classes=list(train_gen.class_indices.keys())
class_count=len(classes)
train_steps=int(np.ceil(len(train_gen.labels)/batch_size))
labels=test_gen.labels
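
The test_batch_size one-liner above is easier to follow in isolation. The sketch below restates it as a small function: it picks the largest divisor of the test-set size that does not exceed 80, so that batch size times steps equals the number of test images. The name plan_test_batches and the max_batch parameter are illustrative, not part of the original answer.

```python
def plan_test_batches(length, max_batch=80):
    """Return (batch_size, steps) with batch_size * steps == length and batch_size <= max_batch."""
    candidates = [length // n for n in range(1, length + 1)
                  if length % n == 0 and length // n <= max_batch]
    batch_size = max(candidates)  # largest divisor of length within the limit
    return batch_size, length // batch_size

for n_images in (150, 200, 97):
    print(n_images, plan_test_batches(n_images))
```

One consequence of this approach: if the number of test images is a prime larger than 80 (97 in the loop above), the only usable divisor is 1, so prediction runs one image per step. Removing or adding a single test image avoids that.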

Now create your model. A suggested model is shown below using EfficientNetB3

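import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax

def make_model(img_size, class_count, lr=.001, trainable=True):
    img_shape=(img_size[0], img_size[1], 3)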
    model_name='EfficientNetB3'
    base_model=tf.keras.applications.efficientnet.EfficientNetB3(include_top=False, weights="imagenet",input_shape=img_shape, pooling='max') 
    base_model.trainable=trainable
    x=base_model.output
    x=keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001 )(x)
    x = Dense(256, kernel_regularizer = regularizers.l2(l = 0.016),activity_regularizer=regularizers.l1(0.006),
                    bias_regularizer=regularizers.l1(0.006) ,activation='relu')(x)
    x=Dropout(rate=.45, seed=123)(x)        
    output=Dense(class_count, activation='softmax')(x)
    model=Model(inputs=base_model.input, outputs=output)
    model.compile(Adamax(learning_rate=lr), loss='categorical_crossentropy', metrics=['accuracy']) 
    return model, base_model # return the base_model so the callback can control its training state

Now call the function

model, base_model=make_model(img_size, class_count)

Now you can train your model

epochs=10 # example value, not from the original answer; tune it for your dataset
history=model.fit(x=train_gen,  epochs=epochs, verbose=0, validation_data=valid_gen,
               validation_steps=None,  shuffle=False,  initial_epoch=0)

After training you can evaluate your model's performance on the test set

loss, acc=model.evaluate(test_gen, steps=test_steps)
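
Overall accuracy can hide weak classes, which matters here because the question is about the detection rate of specific vehicle classes. The sketch below builds a confusion matrix and per-class recall in plain Python from true and predicted class indices. In practice the predicted indices would come from something like the argmax of model.predict(test_gen) and the true ones from test_gen.labels; the function name and the sample values are made up for illustration.

```python
def confusion_and_recall(y_true, y_pred, class_count):
    """Return (matrix, recall); matrix[t][p] counts samples of true class t predicted as p."""
    matrix = [[0] * class_count for _ in range(class_count)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    recall = []
    for t in range(class_count):
        total = sum(matrix[t])  # number of test samples whose true class is t
        recall.append(matrix[t][t] / total if total else float('nan'))
    return matrix, recall

# made-up labels for 3 classes, just to show the call
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]
matrix, recall = confusion_and_recall(y_true, y_pred, class_count=3)
for row in matrix:
    print(row)
print([round(r, 2) for r in recall])
```

Each row of the matrix is a true class and each column a predicted class, so off-diagonal entries show which classes get confused with each other.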

Thanks for your input. I don't know if you noticed, but I wrote that I would like to keep labels that already exist in the model. What you are proposing is to ditch all the labels that the model already has and re-create the ones that I need. Obviously there are plenty of examples online of how to do this, yet none addresses my case, hence the question. — Wodzu, Mar 30 '22 at 10:10
