Convolutional Neural Network model architecture

Question

I'm working on facial expression recognition with Keras.

I have a dataset of 72,000 images. I'm using 80% for training, 10% for validation, and 10% for testing.

All the images are 48 x 48 grayscale.

My model architecture looks like this:

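from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Flatten, Dense, Dropout, Activation
from keras.layers.advanced_activations import PReLU
from keras.optimizers import Adadelta

img_rows, img_cols = 48, 48  # input images are 48 x 48 grayscale

model = Sequential()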
model.add(Conv2D(64, 5, 5, border_mode='valid', input_shape=(img_rows, img_cols, 1)))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(2, 2), dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(5, 5),strides=(2, 2)))

model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf')) 
model.add(Conv2D(64, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf')) 
model.add(Conv2D(64, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(MaxPooling2D(pool_size=(3, 3),strides=(2, 2)))

model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(128, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(128, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(3, 3),strides=(2, 2)))

model.add(Flatten())
model.add(Dense(1024))
model.add(PReLU(init='zero', weights=None))
model.add(Dropout(0.2))
model.add(Dense(1024))
model.add(PReLU(init='zero', weights=None))
model.add(Dropout(0.2))

model.add(Dense(7))
model.add(Activation('softmax'))

ada = Adadelta(lr=0.1, rho=0.95, epsilon=1e-08)
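
To see how these layers fit together, the spatial size can be traced with the usual rule out = floor((in + 2*pad - kernel) / stride) + 1. The sketch below is plain Python, not Keras code. The names `out_size`, `trace`, and `OPS` are made up for illustration, and it assumes the Keras default of 'valid' borders for every convolution and pooling layer, with zero padding applied only where a ZeroPadding2D layer appears.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Output width/height of a conv or pooling layer with 'valid' borders."""
    return (size + 2 * pad - kernel) // stride + 1


# (label, kernel, stride, pad). A ZeroPadding2D layer is modelled as a
# 1x1 "kernel" with stride 1, which simply adds 2 * pad to the size.
OPS = [
    ("conv 5x5", 5, 1, 0),
    ("zero-pad 2", 1, 1, 2),
    ("max-pool 5x5, stride 2", 5, 2, 0),
    ("zero-pad 1", 1, 1, 1),
    ("conv 3x3", 3, 1, 0),
    ("zero-pad 1", 1, 1, 1),
    ("conv 3x3", 3, 1, 0),
    ("max-pool 3x3, stride 2", 3, 2, 0),
    ("zero-pad 1", 1, 1, 1),
    ("conv 3x3", 3, 1, 0),
    ("zero-pad 1", 1, 1, 1),
    ("conv 3x3", 3, 1, 0),
    ("zero-pad 1", 1, 1, 1),
    ("max-pool 3x3, stride 2", 3, 2, 0),
]


def trace(size, ops):
    """Return a list of (label, size) pairs after each layer."""
    sizes = []
    for label, kernel, stride, pad in ops:
        size = out_size(size, kernel, stride, pad)
        sizes.append((label, size))
    return sizes


SIZES = trace(48, OPS)
FINAL = SIZES[-1][1]
# The last convolution block has 128 filters, so Flatten() sees this many values.
FLAT = FINAL * FINAL * 128

for label, size in SIZES:
    print("%-24s -> %d x %d" % (label, size, size))
print("flattened features:", FLAT)
```

Under those assumptions the 48 x 48 input shrinks to 5 x 5 before Flatten(), which is why the kernel size, stride, and padding of each layer limit how many pooling stages a small image can go through.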

I have several questions:

1/ How do I choose the number of layers and their parameters (convolutions, max pooling, dropout, etc.) to get the best performance (accuracy)? What should the choice be based on?

2/ How are the parameters of the different layers (kernel and filter size, strides, etc.) related to each other?

3/ As I said, the images are 48 x 48 grayscale. Is that good enough? Does it affect performance? Is my model architecture well suited to these images? Would bigger or color images improve performance?

score 1 · Answer 1 · answered May 21 '17 at 11:10

Answering 1). You won't know until you try different architectures. Still, it would pay off to automate the process. Try serializing the architecture, or store different architectures in different files under unique IDs. After you have run your experiments, you will be able to see which one did best.
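
A minimal sketch of that bookkeeping, assuming each architecture can be described by a JSON-serializable dictionary. Everything here is hypothetical: the helper names (`arch_id`, `save_arch`, `best_arch`), the dictionary keys, and especially `placeholder_evaluate`, which only stands in for "train the model and return its validation accuracy" and produces no real results.

```python
import hashlib
import json
import os
import tempfile


def arch_id(config):
    """Derive a short, stable ID from the architecture description."""
    blob = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha1(blob).hexdigest()[:10]


def save_arch(config, directory):
    """Write the architecture to <directory>/<id>.json and return the ID."""
    uid = arch_id(config)
    path = os.path.join(directory, uid + ".json")
    with open(path, "w") as fh:
        json.dump(config, fh, sort_keys=True, indent=2)
    return uid


def best_arch(results):
    """Return the ID with the highest recorded score."""
    return max(results, key=results.get)


def placeholder_evaluate(config):
    # NOT a real accuracy: replace with code that builds the model from
    # `config`, trains it, and returns the validation accuracy.
    return 1.0 - config["dropout"]


# Hypothetical candidates to compare.
candidates = [
    {"conv_filters": [64, 64, 128], "dense_units": 1024, "dropout": 0.2},
    {"conv_filters": [32, 64, 128], "dense_units": 512, "dropout": 0.5},
]

out_dir = tempfile.mkdtemp()
results = {}
for config in candidates:
    uid = save_arch(config, out_dir)
    results[uid] = placeholder_evaluate(config)

best = best_arch(results)
print("best architecture file:", os.path.join(out_dir, best + ".json"))
```

With Keras, `model.to_json()` returns a JSON string describing the architecture, so that string could be written to the file instead of a hand-written dictionary.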

Answering 3). Colour would give you three features per pixel (R, G, B) instead of a single gray value, giving your classifier more information with which to classify images correctly. However, it may also make your classifier more sensitive to changes in things like colour balance (in other words, the same face in pictures taken with different settings). I would try gray-scale images first, before tripling the number of features.
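
To put numbers on "triple the amount of features", here is a small arithmetic sketch for 48 x 48 images. The helper names are made up for illustration, and the first-layer figures assume the 5 x 5 convolution with 64 filters and a bias per filter, as in the question.

```python
def n_inputs(height, width, channels):
    """Number of input values per image."""
    return height * width * channels


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square convolution kernel."""
    return kernel * kernel * in_channels * filters + filters


gray_inputs = n_inputs(48, 48, 1)
rgb_inputs = n_inputs(48, 48, 3)

# Only the first convolution sees the input channels directly.
gray_first_conv = conv_params(5, 1, 64)
rgb_first_conv = conv_params(5, 3, 64)

print("inputs per image:   gray", gray_inputs, "rgb", rgb_inputs)
print("first conv params:  gray", gray_first_conv, "rgb", rgb_first_conv)
```

The input triples, but only the first convolution layer gains parameters. The rest of the network is unchanged, because that layer still outputs 64 feature maps either way.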

Look this one up: https://stackoverflow.com/questions/42240489/in-what-ways-are-the-output-of-neural-network-layers-useful?rq=1 --it gives a nice explanation to what you might be thinking of. — OZ13, May 22 '17 at 20:42
