I'm working on facial expression recognition with Keras.
I have a dataset of 72,000 images. I'm using 80% for training, 10% for validation and 10% for testing.
All the images are 48 x 48 grayscale.
My model architecture looks like this:
model = Sequential()
model.add(Conv2D(64, 5, 5, border_mode='valid', input_shape=(img_rows, img_cols, 1)))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(2, 2), dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(5, 5),strides=(2, 2)))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(64, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(64, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(MaxPooling2D(pool_size=(3, 3),strides=(2, 2)))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(128, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(Conv2D(128, 3, 3))
model.add(PReLU(init='zero', weights=None))
model.add(ZeroPadding2D(padding=(1, 1), dim_ordering='tf'))
model.add(MaxPooling2D(pool_size=(3, 3),strides=(2, 2)))
model.add(Flatten())
model.add(Dense(1024))
model.add(PReLU(init='zero', weights=None))
model.add(Dropout(0.2))
model.add(Dense(1024))
model.add(PReLU(init='zero', weights=None))
model.add(Dropout(0.2))
model.add(Dense(7))
model.add(Activation('softmax'))
ada = Adadelta(lr=0.1, rho=0.95, epsilon=1e-08)
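For reference, here is a rough sketch in plain Python (no Keras needed) of how the spatial size of the feature map changes through the stack above. It assumes img_rows = img_cols = 48 and that every convolution and pooling layer uses unpadded ('valid') mode, which is explicit only on the first convolution; the helper names are made up for illustration.

```python
# Trace the feature-map side length (height == width) through the layer stack.
# Assumptions: 48x48 input, 'valid' mode for every convolution and pooling layer.

def conv_or_pool(size, kernel, stride=1):
    """Output side length of an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1

def zero_pad(size, pad):
    """Output side length after adding `pad` zeros on each side."""
    return size + 2 * pad

# (description, function, arguments), in the same order as the model.
layers = [
    ("Conv 5x5", conv_or_pool, (5,)),
    ("ZeroPadding 2", zero_pad, (2,)),
    ("MaxPool 5x5, stride 2", conv_or_pool, (5, 2)),
    ("ZeroPadding 1", zero_pad, (1,)),
    ("Conv 3x3", conv_or_pool, (3,)),
    ("ZeroPadding 1", zero_pad, (1,)),
    ("Conv 3x3", conv_or_pool, (3,)),
    ("MaxPool 3x3, stride 2", conv_or_pool, (3, 2)),
    ("ZeroPadding 1", zero_pad, (1,)),
    ("Conv 3x3", conv_or_pool, (3,)),
    ("ZeroPadding 1", zero_pad, (1,)),
    ("Conv 3x3", conv_or_pool, (3,)),
    ("ZeroPadding 1", zero_pad, (1,)),
    ("MaxPool 3x3, stride 2", conv_or_pool, (3, 2)),
]

size = 48
sizes = [size]
for name, fn, args in layers:
    size = fn(size, *args)
    sizes.append(size)
    print(f"{name:<24} -> {size} x {size}")

# The last convolution has 128 filters, so Flatten() sees size * size * 128 values.
flat_features = size * size * 128
print("Flatten ->", flat_features)
```

Under these assumptions the map shrinks from 48 to 5 on each side, so the first Dense(1024) layer receives a 5 x 5 x 128 tensor flattened into a vector.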
I have several questions:
1/ How should I choose the number of layers and their parameters (convolutions, max pooling, dropout, etc.) to get the best accuracy? What should that choice be based on?
2/ How do the parameters of each layer (kernel size, number of filters, strides, etc.) relate to those of the other layers?
3/ As mentioned, the images are 48 x 48 grayscale. Is that good enough? Does it affect performance? Is my model architecture well suited to these images? Would using larger or color images improve performance?
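To make question 2 more concrete, the sketch below counts the trainable weights of the convolution and dense layers from their kernel size, input channels and filter count. It is only an illustration: the helper names are hypothetical, the PReLU parameters are left out, and the flattened size of 5 x 5 x 128 is an assumption that holds for 48 x 48 input when all convolutions and poolings are unpadded.

```python
# Count weights and biases per layer; PReLU parameters are deliberately ignored.

def conv_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels weight block per filter, plus one bias each."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Fully connected weights plus one bias per unit."""
    return inputs * units + units

# Assumption: the last pooling layer outputs a 5 x 5 map with 128 channels.
FLAT = 5 * 5 * 128

counts = {
    "conv1 5x5, 1 -> 64": conv_params(5, 1, 64),
    "conv2 3x3, 64 -> 64": conv_params(3, 64, 64),
    "conv3 3x3, 64 -> 64": conv_params(3, 64, 64),
    "conv4 3x3, 64 -> 128": conv_params(3, 64, 128),
    "conv5 3x3, 128 -> 128": conv_params(3, 128, 128),
    "dense1 3200 -> 1024": dense_params(FLAT, 1024),
    "dense2 1024 -> 1024": dense_params(1024, 1024),
    "dense3 1024 -> 7": dense_params(1024, 7),
}

for name, n in counts.items():
    print(f"{name:<24} {n:>10,}")

total = sum(counts.values())
print(f"{'total':<24} {total:>10,}")
```

With these numbers, most of the weights sit in the first dense layer, whose size is set by the spatial dimensions that the preceding kernels, strides and paddings leave behind.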