I am trying to train a multi-task multi-label classifier with Keras. The output of the network forks into two heads, and each head predicts the categories of its own task. The y vectors are one-hot encoded.
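For context, here is a minimal sketch of the architecture (the input dimension, layer sizes, and category counts are placeholders, not my real values; only the output names O1 and O2 match the compile call below):

    from keras.layers import Input, Dense
    from keras.models import Model

    # Placeholder shapes -- the real network is larger
    inputs = Input(shape=(100,))
    shared = Dense(64, activation='relu')(inputs)

    # The output forks into one softmax head per task
    out1 = Dense(10, activation='softmax', name='O1')(shared)
    out2 = Dense(5, activation='softmax', name='O2')(shared)

    fork1 = Model(inputs=inputs, outputs=[out1, out2])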
I am using a custom generator for my data that yields the y arrays in a list to the fit_generator function.
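The generator is roughly equivalent to this sketch (batch_size, the arrays x_train, y1_train, y2_train, and the step/epoch counts are placeholders):

    import numpy as np

    def batch_generator(x, y1, y2, batch_size=32):
        # Yields (inputs, [y_task1, y_task2]) forever:
        # one one-hot y array per output head, in model output order
        n = len(x)
        while True:
            idx = np.random.randint(0, n, batch_size)
            yield x[idx], [y1[idx], y2[idx]]

    fork1.fit_generator(batch_generator(x_train, y1_train, y2_train),
                        steps_per_epoch=100, epochs=10)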
I am using a categorical_crossentropy loss function on each output:
    from keras import optimizers

    fork1.compile(loss={'O1': 'categorical_crossentropy', 'O2': 'categorical_crossentropy'},
                  optimizer=optimizers.Adam(lr=0.001),
                  metrics=['accuracy'])
The problem: the loss doesn't decrease with this setup. However, if I train each task separately, I get a low loss and high accuracy. So what could be the problem?