x_train1, x_test, y_train1, y_test = train_test_split(images, labels, test_size=0.2, random_state=42)
x_train2, x_val, y_train2, y_val = train_test_split(x_train1, y_train1, test_size=0.05, random_state=42)

Layers

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 1), kernel_regularizer=keras.regularizers.l2(0.005), padding='same', name='Conv_1'))
model.add(MaxPooling2D((2, 2), name='MaxPool_1'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', kernel_regularizer=keras.regularizers.l2(0.005), name='Conv_2'))
model.add(MaxPooling2D((2, 2), name='MaxPool_2'))
model.add(Flatten(name='Flatten'))
model.add(Dropout(0.5, name='Dropout'))
model.add(Dense(64, kernel_initializer='normal', activation='relu', name='Dense_1'))
model.add(Dense(1, kernel_initializer='normal', activation='sigmoid', name='Dense_2'))
model.summary()

Model compile

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(x_train2, y_train2, validation_data=(x_test, y_test), batch_size=32, epochs=100)

**Results**
Train: accuracy = 0.939577; loss = 0.134506
Test: accuracy = 0.767908; loss = 0.8002433

1 Answer


Regularization is not a magical option that will simply close the gap between train and test performance at any arbitrary strength. One way of thinking about this: take the regularization coefficient alpha (in your case 0.005) and express the gap between train and test accuracy as a function of it, say f(alpha) (in your case f(0.005) = 0.94 - 0.76 = 0.18). The only thing we know for sure is that f(inf) = 0. In other words, as you increase the regularization strength, the gap disappears, but at the cost of the training score going down. There is no single magical form of regularization, and there is no guarantee that L2 is a good fit for your problem. You can make the gap disappear just by making the coefficient larger, but that may drive both train and test performance very low.
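As a rough illustration of that trade-off (the helper name `build_model`, the sweep values, and the epoch count are my own assumptions, not part of the question), you could sweep the L2 coefficient over the same architecture and watch how the train/test gap behaves:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    def build_model(l2_coeff):
        """Same architecture as in the question, parameterized by the L2 strength."""
        reg = keras.regularizers.l2(l2_coeff)
        model = keras.Sequential([
            layers.Conv2D(32, (3, 3), activation='relu', padding='same',
                          kernel_regularizer=reg, input_shape=(128, 128, 1)),
            layers.MaxPooling2D((2, 2)),
            layers.Conv2D(64, (3, 3), activation='relu', padding='same',
                          kernel_regularizer=reg),
            layers.MaxPooling2D((2, 2)),
            layers.Flatten(),
            layers.Dropout(0.5),
            layers.Dense(64, activation='relu'),
            layers.Dense(1, activation='sigmoid'),
        ])
        model.compile(loss='binary_crossentropy', optimizer='adam',
                      metrics=['accuracy'])
        return model

    # Sweep a few regularization strengths; the gap f(alpha) shrinks as alpha grows,
    # but usually the training accuracy itself drops along with it.
    for alpha in [0.0, 0.005, 0.05, 0.5]:
        model = build_model(alpha)
        model.fit(x_train2, y_train2, batch_size=32, epochs=20, verbose=0)
        _, train_acc = model.evaluate(x_train2, y_train2, verbose=0)
        _, test_acc = model.evaluate(x_test, y_test, verbose=0)
        print(f"alpha={alpha}: train={train_acc:.3f} test={test_acc:.3f} "
              f"gap f(alpha)={train_acc - test_acc:.3f}")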

lejlot
  • Thank you so much for responding to me. As you said, the training accuracy will go down, but then what is the solution for raising both accuracies? – WAJEEHA KHALIL Feb 21 '22 at 17:29
  • There is no "one solution". You are asking a research question, and there is a whole literature on the topic. The most basic solutions are: get much more data, use data augmentation, increase the model size in parallel, and use well-known existing architectures (e.g. start with ResNet) rather than trying to "invent" an architecture or tweak layer sizes yourself. – lejlot Feb 21 '22 at 19:39